Foreword

Periodic phenomena in biology and climatology occur so widely that we
tend either to adapt to them as unavoidable nuisances or to be overimpressed
by their day-to-day deviations. We can't "see the forest for the trees." If
the variable occurs around the clock or through the year, but with
systematically unequal magnitudes, its underlying pattern can often be
expressed logically in relatively simple trigonometric terms. When this
classic mathematical model is combined with an appropriate statistical
analysis, we are better able both to describe the periodic trend and to
study deviations from its pattern. For example, we can separate weather
into its orderly and its random elements and by this means estimate the
probability of occurrence of critical temperatures. This approach is
sufficiently novel, even to biologists and climatologists with a background
in modern statistics, that the technique is described here in some detail.
Its applications are illustrated with a wide range of biological examples
and a more detailed study of a typical climatological series.


Periodic Regression
in Biology and Climatology


C. I. Bliss


Most non-linear regressions in biology and many in climatology are handled
in one of two ways. The first is to convert the relation to a straight line
by the selection, on either theoretical or empirical grounds, of a suitable
unit for each variable, such as its reciprocal, logarithm, probit or logit.
A second approach is to fit a polynomial equation relating the dependent
variable $y$ to successive functions of the independent variable $x$. In one
familiar form, these functions are the powers of $x$, leading to an equation
of the form

$$Y = a + b_1 x + b_2 x^2 + b_3 x^3 + \cdots + b_k x^k \tag{1}$$

Given $k + 1$ values of our independent variable, the curve defined by this
equation will fit exactly the mean responses $\bar{y}$ at each $x$, if
extended to $k$ powers of $x$. In practice, we terminate the series as soon
as the residual variation of $\bar{y}$ about the fitted curve is comparable
with the variation of the individual $y$'s about their respective means.

When the relation between $x$ and $y$ is periodic, our polynomial equation
will be more rational if we substitute trigonometric functions of $x$ for
its powers, leading to harmonic or Fourier analysis, or "periodic
regression" as it is termed by Aitken (1939). The problem is further
simplified when the independent variable $x$ is cyclical in character with a
length fixed independently of the response. Typical variables include the
hour of day in the diurnal cycle, the month or week in the annual cycle, and
the compass direction in dispersion from a center. We are not concerned here
with cycles determined a posteriori, such as from fluctuations in the
abundance of animals or of plant pests, nor with "cycles" which represent an
age trend in a single group of individuals, such as the monthly egg
production from a single set of pullets through the year. We will further
assume that each of the equally-spaced subdivisions in the cycle is
represented by a constant number of observations. Within these restrictions,
periodic regression parallels the more familiar curvilinear regression in
which the orthogonal polynomials represent the successive powers of $x$.


The  Sine  Curve 

Many periodic biological functions can be fitted by the symmetrical sine
curve. We start with $f$ values of our dependent variable $y$ at each of $k$
observed times $t$ (or other interval) within the cycle. The expected
response $Y$ at each $t$ may then be computed from the sine curve, expressed
conveniently in the form

$$Y = a_0 + A\cos(ct - \theta) \tag{2}$$

where $a_0 = \bar{y}$ is the mean response over $f$ complete periods or
cycles. The coefficient $A$ is the semi-amplitude or one-half the range from
the maximum to the minimum $Y$. The constant $c = 2\pi/k$ converts the
numbered units of time, $t = 0, 1, 2, \ldots, k-1$, in a single cycle to
angular measure in radians. The statistic $\theta$ is the phase angle or the
time in angular measure of the maximum response $Y$. It shifts the origin for
measuring time from an arbitrary starting point $t_0$ to the time at which
the response is a maximum. The angles could be measured equally in degrees
instead of in radians, but radians have been selected here as the more
convenient. One complete cycle of $360^\circ = 2\pi = 6.283185$ radians.
These various functions of the sine curve are shown graphically in Figure 1.

Figure 1. The sine curve and its constants.

For estimating its constants from the observed responses, we may rewrite
Equation 2 as

$$Y = a_0 + a_1\cos(ct) + b_1\sin(ct) \tag{3}$$

an equation linear in the adjustable parameters $a_1$ and $b_1$, where

$$A = \sqrt{a_1^2 + b_1^2} \tag{4}$$

and

$$\tan\theta = b_1/a_1 \tag{5}$$
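
The equivalence of Equations 2 and 3 can be checked numerically. The following is only a minimal sketch: the function names and the sample constants are hypothetical and are not taken from the bulletin. It relies on the expansion of $\cos(ct - \theta)$, which gives $a_1 = A\cos\theta$ and $b_1 = A\sin\theta$, from which Equations 4 and 5 follow.

```python
import math

def sine_curve(t, k, a0, A, theta):
    """Equation 2: Y = a0 + A*cos(c*t - theta), with c = 2*pi/k."""
    c = 2 * math.pi / k
    return a0 + A * math.cos(c * t - theta)

def linear_form(t, k, a0, a1, b1):
    """Equation 3: Y = a0 + a1*cos(c*t) + b1*sin(c*t)."""
    c = 2 * math.pi / k
    return a0 + a1 * math.cos(c * t) + b1 * math.sin(c * t)

# Hypothetical constants, chosen only for illustration.
k, a0, A, theta = 12, 50.0, 20.0, 0.3
a1 = A * math.cos(theta)   # from expanding cos(c*t - theta)
b1 = A * math.sin(theta)

# Largest disagreement between the two forms over one cycle.
max_diff = max(
    abs(sine_curve(t, k, a0, A, theta) - linear_form(t, k, a0, a1, b1))
    for t in range(k)
)
```

Here `max_diff` should be zero apart from floating-point rounding, and `math.hypot(a1, b1)` returns the semi-amplitude of Equation 4.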


The expected response $Y$ for a given $t$ can be computed directly from
Equation 3 without conversion to the original form. The range in units of
$y$ is equal to twice the semi-amplitude or $2A$. To determine the correct
quadrant for the phase angle $\theta$, we first determine from a table of
trigonometric functions the angle in radians corresponding to
$\tan\theta' = |b_1/a_1|$, and from the signs of the coefficients $a_1$ and
$b_1$ convert $\theta'$ to the phase angle $\theta$ by Figure 2 (Brooks and
Carruthers, 1953). Then on the time scale measured from $t_0$, the maximum
response occurs at the time $k\theta/2\pi$. Since the sine curve is
symmetrical, the time for the minimum is one-half cycle before or after the
time of the maximum.

Figure 2. Conversion of $\theta'$, where $\tan\theta' = |b_1/a_1|$, to the
phase angle $\theta$.

Table 1. Cosines (u_i) and sines (v_i) for the harmonic analysis of cyclical
data recorded in k equally-spaced fractions per cycle and numbered
consecutively from t = 0 to t = k-1.

k = 7

 t     u_1     v_1     u_2     v_2     u_3     v_3
 0       1       0       1       0       1       0
 1  0.6235  0.7818 -0.2225  0.9749 -0.9010  0.4339
 2 -0.2225  0.9749 -0.9010 -0.4339  0.6235 -0.7818
 3 -0.9010  0.4339  0.6235 -0.7818 -0.2225  0.9749
 4 -0.9010 -0.4339  0.6235  0.7818 -0.2225 -0.9749
 5 -0.2225 -0.9749 -0.9010  0.4339  0.6235  0.7818
 6  0.6235 -0.7818 -0.2225 -0.9749 -0.9010 -0.4339

k = 8

 t     u_1     v_1     u_2     v_2     u_3     v_3     u_4
 0       1       0       1       0       1       0       1
 1   0.707   0.707       0       1  -0.707   0.707      -1
 2       0       1      -1       0       0      -1       1
 3  -0.707   0.707       0      -1   0.707   0.707      -1
 4      -1       0       1       0      -1       0       1
 5  -0.707  -0.707       0       1   0.707  -0.707      -1
 6       0      -1      -1       0       0       1       1
 7   0.707  -0.707       0      -1  -0.707  -0.707      -1

k = 12

 t     u_1     v_1     u_2     v_2     u_3     v_3     u_4     v_4
 0       1       0       1       0       1       0       1       0
 1   0.866     0.5     0.5   0.866       0       1    -0.5   0.866
 2     0.5   0.866    -0.5   0.866      -1       0    -0.5  -0.866
 3       0       1      -1       0       0      -1       1       0
 4    -0.5   0.866    -0.5  -0.866       1       0    -0.5   0.866
 5  -0.866     0.5     0.5  -0.866       0       1    -0.5  -0.866
 6      -1       0       1       0      -1       0       1       0
 7  -0.866    -0.5     0.5   0.866       0      -1    -0.5   0.866
 8    -0.5  -0.866    -0.5   0.866       1       0    -0.5  -0.866
 9       0      -1      -1       0       0       1       1       0
10     0.5  -0.866    -0.5  -0.866      -1       0    -0.5   0.866
11   0.866    -0.5     0.5  -0.866       0      -1    -0.5  -0.866

k = 24

 t     u_1     v_1     u_2     v_2
 0       1       0       1       0
 1   0.966   0.259   0.866     0.5
 2   0.866     0.5     0.5   0.866
 3   0.707   0.707       0       1
 4     0.5   0.866    -0.5   0.866
 5   0.259   0.966  -0.866     0.5
 6       0       1      -1       0
 7  -0.259   0.966  -0.866    -0.5
 8    -0.5   0.866    -0.5  -0.866
 9  -0.707   0.707       0      -1
10  -0.866     0.5     0.5  -0.866
11  -0.966   0.259   0.866    -0.5
12      -1       0       1       0
13  -0.966  -0.259   0.866     0.5
14  -0.866    -0.5     0.5   0.866
15  -0.707  -0.707       0       1
16    -0.5  -0.866    -0.5   0.866
17  -0.259  -0.966  -0.866     0.5
18       0      -1      -1       0
19   0.259  -0.966  -0.866    -0.5
20     0.5  -0.866    -0.5  -0.866
21   0.707  -0.707       0      -1
22   0.866    -0.5     0.5  -0.866
23   0.966  -0.259   0.866    -0.5

For t = 0-11 and 12-23: u_2, v_2 = u_1, v_1 (k = 12); u_4, v_4 = u_2, v_2 (k = 12).
For t = 0-7, 8-15, 16-23: u_3, v_3 = u_1, v_1 (k = 8).

For k = 4: u_1, v_1 = u_2, v_2 (k = 8, t = 0-3).

For k = 6: u_1, v_1 = u_2, v_2 (k = 12, t = 0-5); u_2, v_2 = u_4, v_4 (k = 12, t = 0-5).
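
The conversion of $\theta'$ to the phase angle by the signs of the two coefficients can be sketched in code. This is an illustration only: Figure 2 is not reproduced here, so the sketch assumes the usual quadrant convention in which $\cos\theta$ takes the sign of $a_1$ and $\sin\theta$ the sign of $b_1$. The function names `phase_angle` and `time_of_maximum` are hypothetical.

```python
import math

def phase_angle(a1, b1):
    """Convert theta' = arctan|b1/a1| to the phase angle theta in [0, 2*pi).

    Assumes the standard quadrant convention: cos(theta) has the sign of a1
    and sin(theta) the sign of b1.
    """
    if a1 == 0 and b1 == 0:
        raise ValueError("phase is undefined when both coefficients are zero")
    if a1 == 0:
        return math.pi / 2 if b1 > 0 else 3 * math.pi / 2
    theta_prime = math.atan(abs(b1 / a1))
    if a1 > 0 and b1 >= 0:
        return theta_prime                  # first quadrant
    if a1 < 0 and b1 >= 0:
        return math.pi - theta_prime        # second quadrant
    if a1 < 0 and b1 < 0:
        return math.pi + theta_prime        # third quadrant
    return 2 * math.pi - theta_prime        # fourth quadrant

def time_of_maximum(a1, b1, k):
    """Time of the maximum response, k*theta/(2*pi), measured from t0."""
    return k * phase_angle(a1, b1) / (2 * math.pi)
```

In practice the same angle is returned by `math.atan2(b1, a1) % (2 * math.pi)`, which resolves the quadrant without a lookup.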

For any selected series of $k$ equally-spaced intervals in each complete
cycle, the cosines and sines corresponding to the successive intervals of
$t = 0, 1, 2, \ldots, k-1$ are listed in the columns for $u_1$ and $v_1$ in
Table 1. Each forms an orthogonal set of independent variates (within a
negligible rounding error) similar to the orthogonal polynomials for the
successive powers of $x$. With $u_1 = \cos(ct)$ and $v_1 = \sin(ct)$,
Equation 3 may be written as

$$Y = a_0 + a_1 u_1 + b_1 v_1 \tag{6}$$

where $\Sigma u_1 = \Sigma v_1 = \Sigma(u_1 v_1) = 0$. The cosines and sines
in Table 1 cover the series encountered most commonly and include the higher
harmonics required for the Fourier analysis in the next section. Except for
rounding errors, which usually may be neglected, the denominator of $a_1$
and of $b_1$ is the same for all evenly-spaced series of the same length
$k$, or $\Sigma u_1^2 = \Sigma v_1^2 = \tfrac{1}{2}k$. With this short-cut,
the regression coefficients for a single measure at each time $t$ ($f = 1$)
are readily computed as

$$
\begin{aligned}
a_1 &= \Sigma(u_1 y)/\Sigma u_1^2 = [u_1 y]/\tfrac{1}{2}k \\
\text{and}\quad b_1 &= \Sigma(v_1 y)/\Sigma v_1^2 = [v_1 y]/\tfrac{1}{2}k
\end{aligned}
\tag{7}
$$

With $f$ replicated $y$'s at each $t$, totalling $T_t$, the regression
coefficients are computed directly from the $T_t$'s as

$$
\begin{aligned}
a_1 &= \Sigma(u_1 T_t)/f\,\Sigma u_1^2 = [u_1 T_t]/\tfrac{1}{2}fk \\
\text{and}\quad b_1 &= \Sigma(v_1 T_t)/f\,\Sigma v_1^2 = [v_1 T_t]/\tfrac{1}{2}fk
\end{aligned}
\tag{8}
$$
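
As a rough sketch of Equations 7 and 8, the coefficients can be computed from the interval totals as follows. The function names and the synthetic totals are assumptions made for illustration; the short-cut denominator $\tfrac{1}{2}fk$ holds for $k \geq 3$ equally-spaced intervals.

```python
import math

def harmonic_variates(k, i=1):
    """Return (u, v): cos(i*c*t) and sin(i*c*t) for t = 0..k-1, c = 2*pi/k."""
    c = 2 * math.pi / k
    u = [math.cos(i * c * t) for t in range(k)]
    v = [math.sin(i * c * t) for t in range(k)]
    return u, v

def first_harmonic(totals, f=1):
    """Equation 8 (Equation 7 when f = 1): a0, a1, b1 from the k totals T_t."""
    k = len(totals)
    u, v = harmonic_variates(k)
    a0 = sum(totals) / (f * k)
    a1 = sum(ui * T for ui, T in zip(u, totals)) / (0.5 * f * k)
    b1 = sum(vi * T for vi, T in zip(v, totals)) / (0.5 * f * k)
    return a0, a1, b1

# Hypothetical check: totals built from a known curve should be recovered.
k, f = 12, 3
u, v = harmonic_variates(k)
totals = [f * (10.0 + 4.0 * u[t] + 1.5 * v[t]) for t in range(k)]
a0, a1, b1 = first_harmonic(totals, f)
```

With these synthetic totals the fitted values return the constants 10.0, 4.0 and 1.5 used to build them, up to rounding.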

As an example of simple periodic regression, we may fit a sine curve to the
monthly mean temperatures in New Haven (Table 2), for the 14 years from July
1943, when the Weather Bureau station was moved to its present location at
the municipal airport, through June 1957. The totals $T_t$ in the last row
of Table 2 were multiplied by the variates $u_1$ and $v_1$ in Table 1 for
$k = 12$ to obtain by Equation 8 the regression coefficients
$a_1 = 1763.0944/84 = 20.9892$ and $b_1 = 292.7604/84 = 3.4852$. With these
coefficients and the mean, $a_0 = 8528.6/168 = 50.7655$, the expected $Y$
for each month has been computed by Equation 6 and the corresponding
variates $u_1$ and $v_1$ in Table 1. The $Y$'s have been plotted as the
curve in Figure 3, together with the observed monthly means $\bar{y}_t$. In
this as in most other figures the first few months have been repeated at the
end, so as to emphasize the cyclic character of the relation. Inspection
indicates a good fit; how good we will test more fully in a later section.

From these records the seasonal range or amplitude in the mean temperature
at New Haven is $2A = 2\sqrt{20.9892^2 + 3.4852^2} = 42.553$ °F as estimated
from the sine curve by Equation 4. To determine the time of the maximum
(Equation 5), we may compute $\tan\theta' = 3.4852/20.9892 = 0.16605$ and
from a trigonometric table, interpolate the angle $\theta'$ corresponding to
this tangent. With both $a_1$ and $b_1$ positive, $\theta$ falls in the
first quadrant (Figure 2), so that $\theta = \theta' = 0.16455$ radians and
the maximum temperature is reached at
$12\theta/2\pi = 1.9746/6.2832 = 0.3143$ months from our starting point
($t_0$) in the annual cycle. Since $t_0$ corresponds to mid-July, this
places the maximum temperature in New Haven approximately at July 25 over
these 14 years and the minimum six months later on January 24. These
estimates, of course, are subject to sampling errors which will be
considered in a later section. Apart from their intrinsic interest, they
permit rewriting the prediction equation in Equation 6 in the form of
Equation 2, if this is preferred, as

$$Y = 50.7655 + 21.2766\cos(0.5236t - 0.16455)$$

where $t$ is the number of the month (Table 1).
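
The arithmetic of this example can be retraced from the sums quoted in the text. The sketch below uses only those published sums; the variable names and the use of `math.atan2` in place of a trigonometric table are choices made here for illustration.

```python
import math

# Sums quoted in the text for New Haven (f = 14 years, k = 12 months).
k = 12
a0 = 8528.6 / 168        # overall mean
a1 = 1763.0944 / 84      # Equation 8, cosine coefficient
b1 = 292.7604 / 84       # Equation 8, sine coefficient

A = math.hypot(a1, b1)               # Equation 4, semi-amplitude
theta = math.atan2(b1, a1)           # Equation 5; both signs positive here
t_max = k * theta / (2 * math.pi)    # months after t0 (mid-July)

def expected(t):
    """Prediction in the form of Equation 2."""
    return a0 + A * math.cos(2 * math.pi * t / k - theta)
```

Evaluated this way, `2 * A`, `theta` and `t_max` agree with the 42.553 °F, 0.16455 radians and 0.3143 months given above to the number of figures quoted.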

Figure 3. Monthly mean temperatures from Table 2 fitted with a sine curve.


The Fourier Series

The plotted means may not define as symmetrical a relation as the sine
curve. By Fourier analysis we can add the higher harmonics, corresponding to
2, 3, 4 or more complete cycles in the basic interval covered by one cycle
of the sine curve. If we add enough terms the computed curve will fit any
observed series exactly, but the equation then has little meaning either
biologically or climatologically. Our objective is to add no more terms than
are needed to reduce the variance from the scatter of $\bar{y}_t$'s about
the fitted line to the same magnitude as the residual error. We may stop
well short of this if the scatter seems essentially random even though its
variance is significantly larger than the residual variation.

The sine curve in Equation 6 is extended with additional terms to

$$Y = a_0 + a_1 u_1 + b_1 v_1 + a_2 u_2 + b_2 v_2 + a_3 u_3 + b_3 v_3 + \cdots \tag{9}$$

where $u_2 = \cos(2ct)$, $v_2 = \sin(2ct)$, $u_3 = \cos(3ct)$,
$v_3 = \sin(3ct)$, etc. and each pair of coefficients $a_i$ and $b_i$ is
computed with Equations 7 or 8, replacing $u_1$ and $v_1$ by $u_i$ and $v_i$
for $i = 1, 2, 3, \ldots$ successively. The $u_i$'s and $v_i$'s convert the
scale of $t$ to orthogonal units in which
$\Sigma(u_i v_i) = \Sigma(u_i u_j) = \Sigma(v_i v_j) = 0$ where $i \neq j$.
There is the additional advantage that for any given $k$,
$\Sigma u_i^2 = \Sigma v_i^2 = \tfrac{1}{2}k$ for all values of $i$, except
the last term where $k$ is even and then $\Sigma u_i^2 = k$. The values of
$u_i$ and $v_i$ for the first terms of the Fourier series are given in
Table 1 for $k$ = 4, 6, 7, 8, 12 and 24 subdivisions per cycle.
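
The orthogonality relations stated for the Fourier variates can be verified directly. The sketch below is illustrative: the helper names are hypothetical, and $k = 12$ is chosen only because it appears in Table 1.

```python
import math

def fourier_variates(k, n_harmonics):
    """Columns u_i = cos(i*c*t) and v_i = sin(i*c*t), t = 0..k-1, c = 2*pi/k."""
    c = 2 * math.pi / k
    cols = {}
    for i in range(1, n_harmonics + 1):
        cols[f"u{i}"] = [math.cos(i * c * t) for t in range(k)]
        cols[f"v{i}"] = [math.sin(i * c * t) for t in range(k)]
    return cols

def dot(x, y):
    """Sum of products of two columns."""
    return sum(p * q for p, q in zip(x, y))

k = 12
cols = fourier_variates(k, k // 2)

# Sums of squares: k/2 for every column, except k for the last cosine
# (and zero for the last sine) when k is even.
sums_of_squares = {name: dot(col, col) for name, col in cols.items()}

# Largest sum of cross products between two different columns.
names = list(cols)
max_cross = max(
    abs(dot(cols[p], cols[q]))
    for n, p in enumerate(names)
    for q in names[n + 1:]
)
```

For $k = 12$ the last harmonic ($i = 6$) has $u_6 = \pm 1$ and $v_6 = 0$ at every $t$, which is why only its cosine term carries a sum of squares, equal to $k$.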

A seasonal trend which is not a simple sine curve occurs in the iodine value
of butterfat at five stations in central Alberta, Canada, as reported by
Wood (1956). Each entry in Appendix Table 1 represents duplicate analyses of
the weekly samples of butter in each month for two years beginning in April
1952, or an average of 17.3 determinations. Both the annual total for each
station and the month with the peak reading tended to shift in going south
from Edmonton to Calgary. According to Wood, the monthly readings in the two
years, which have been averaged, did not differ significantly. Although a
shift in the phase angle from one location to another accounts for part of
the complexity of the average curve, the iodine values for each location
could not be fitted adequately with a separate sine curve.

From the sums of products of $T_t$ with the cosines ($u_i$) and sines
($v_i$) in Table 1 for the first three harmonics, the seasonal trend of the
means in the upper part of Figure 4 is reproduced quite faithfully by the
equation:

$$Y = 36.955 + 0.409\,u_1 + 1.7318\,v_1 + 0.0700\,u_2 - 0.5542\,v_2 + 0.2233\,u_3 + 0.7467\,v_3$$

This curve is merely the overall mean, $a_0 = 36.955$, plus the deviations
for each harmonic in each month, as the reader may verify from the last
three rows of Appendix Table 1. The Fourier terms have been plotted
separately in the lower part of Figure 4 as deviations from the mean $a_0$,
where it is evident that they define successively 1, 2, and 3 complete
cycles within the year.

In the present case the biological implications of the successive harmonics
are by no means clear. Iodine values are indicative of the unsaturated fatty
acid content of butter and are expected to be high during the grass feeding
season in May. As noted by the author, the peak in August and September,
most pronounced in the North and decreasing southward, was unexpected.
Although the biological information gained in fitting a Fourier series is
here questionable, the example has served its primary purpose of
demonstrating that an apparently irregular curve can be fitted by harmonic
analysis with a limited number of constants.


[Figure 4: upper panel, three-term Fourier curve; lower panel, component
cycles; horizontal axis, month.]

Figure 4. Mean monthly iodine values for butterfat from Appendix Table 1.
The sum of the deviations in the lower three curves, added to the mean
($a_0$), yields the three-term Fourier curve in the upper diagram.


Analysis of Variance

The analysis of variance has the same function in periodic regression as in
many other regression problems. The variation in $y$ about the fitted curve
is assumed to be normally distributed, equally variable over the length of
the cycle, and with deviations independent of each other. The selection of a
suitable transform may aid materially in achieving these objectives, as we
shall see in a later section. A more troublesome problem is the potential
dependence between successive observations through a cycle. Despite the
formal analogy of a cross-classification to randomized blocks, the responses
in each row represent an ordered sequence rather than an arrangement upon
which treatments have been superimposed at random.

One approach is to fit a Fourier series to the column means and compute a
serial correlation coefficient from the successive residuals, as described
by Anderson and Anderson (1950). In a time sequence, such as of weather
records or of attack rates by a contagious disease, these correlations are
often significant. An alternative approach, more consonant with the analysis
of variance, is to fit a separate Fourier series with a limited number of
terms to each replicate. The interaction of rows by columns, or of
replicates by periods, is then subdivided into as many parts as may be
needed to remove the systematic difference between the trend in each series
and the mean trend. In this way we may separate the composite interaction
into cyclic trends and residual error. The same argument holds, of course,
whether replicates represent successive cycles, such as the years in
Table 2, or sampling locations as in Appendix Table 1. The more nearly these
separate curves define the periodic trend in each replicate with the fewest
terms, the more nearly will the residual error provide an unbiased estimate
of the random error.

With an orthogonal design, the calculation is very similar to that for
randomized blocks. The sum of squares between the $f$ totals $T_r$ for
replicates, representing successive complete cycles or different locations,
corresponds to variation in the statistic $a_0$ of our separately fitted
series. When these totals suggest a trend, we may wish to isolate its linear
and quadratic terms to test its form and significance. The sum of squares
between the $k$ totals $T_t$ for each interval within the cycle may be
subdivided progressively, beginning with $a_1$ and $b_1$ for the first
harmonic with two degrees of freedom, and following with the second and
higher harmonics from the Fourier series, until the scatter about the fitted
curve contains no element which we can isolate with profit.
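
The progressive subdivision of the sum of squares between interval totals can be illustrated with a short sketch. It assumes the usual regression sum of squares for each orthogonal variate, $[xT_t]^2 / f\,\Sigma x^2$; the function names and the totals are hypothetical and serve only to show that the harmonics together exhaust the $k - 1$ degrees of freedom.

```python
import math

def harmonic_ss(totals, f, i):
    """Sum of squares for the i-th harmonic from the k totals T_t."""
    k = len(totals)
    c = 2 * math.pi / k
    ss = 0.0
    for trig in (math.cos, math.sin):
        x = [trig(i * c * t) for t in range(k)]
        sxx = sum(xi * xi for xi in x)
        if sxx > 1e-9:   # the sine term vanishes when i = k/2
            sxT = sum(xi * T for xi, T in zip(x, totals))
            ss += sxT ** 2 / (f * sxx)
    return ss

def between_intervals_ss(totals, f):
    """Sum of squares between the k interval totals, k - 1 degrees of freedom."""
    k = len(totals)
    return sum(T * T for T in totals) / f - sum(totals) ** 2 / (f * k)

# Hypothetical totals for k = 8 intervals and f = 2 replicates.
totals = [21.0, 25.5, 30.2, 28.9, 24.1, 19.8, 17.3, 18.6]
f = 2
parts = [harmonic_ss(totals, f, i) for i in range(1, len(totals) // 2 + 1)]
```

With all four harmonics included (2 + 2 + 2 + 1 degrees of freedom for $k = 8$), `sum(parts)` reproduces `between_intervals_ss(totals, f)`. In practice one stops after the first few terms and leaves the remainder as scatter.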

The remaining sum of squares, the interaction of replicates by measured
intervals within the cycle, includes not only the random error but also the
variation from replicate to replicate of each harmonic in the Fourier
series, in so far as these represent systematic rather than random
deviations. The differences between cycles in the first harmonic, with
$2(f-1)$ degrees of freedom, almost certainly should be isolated and tested.
In deciding how much farther to partition the interaction, our most useful
guide, when available, is the theoretical or expected variance, with which
we can compare each mean square. In its absence, we may subdivide the
interaction into as many additional terms of the Fourier series as have
proved useful in fitting the means of all replicates. This rule is rough at
best, since a significant higher term may repeat itself so consistently in
all replicates that it will not remove a systematic component from the
interaction. Alternatively, systematic trends in the individual cycles,
corresponding to the second or higher harmonic, may cancel one another when
averaged over all replicates.

Mathematical  model 

These relations may be reduced to more concrete terms by an explicit
mathematical model. A single variate occurring in the $i$th year (or
replicate) and the $j$th month (or interval) is potentially the sum of a
number of elements. An element with a subscript $i$ has the same value
through a given year but may vary from year to year; an element with a
subscript $j$ has a fixed value for a given month but may vary from month to
month; an element with both subscripts is specific for a given month and
year. With this notation each individual variate $y_{ij}$ may consist of the
following terms

$$
\begin{aligned}
y_{ij} ={}& (\mu + r_i) + (\alpha_1 + a'_{1i})u_{1j} + (\beta_1 + b'_{1i})v_{1j} + (\alpha_2 + a'_{2i})u_{2j} \\
 &+ (\beta_2 + b'_{2i})v_{2j} + t_j + e_{ij}
\end{aligned}
\tag{10}
$$

where the Latin and Greek terms in parentheses correspond to the
expectations for the successive statistics of the two-term Fourier curve in
Equation 9 for the year $i$, and $(t_j + e_{ij})$ represents the difference
between the observed value $y_{ij}$ and its expectation $Y_{ij}$. Greek
letters stand for the expected values of the same curve fitted to the
monthly means over all years or replicates, $t_j$ is the difference between
the observed and expected mean for a given month (or other interval), and
$e_{ij}$ is the inescapable normal random component.

Our null hypothesis is that each intermediate element in Equation 10 (except
the cosines and sines) is zero, which, if true, would reduce the model to
$y_{ij} = \mu + e_{ij}$. If a single two-term Fourier equation, estimated
from the total $T_t$ for each month (or other interval), were to describe
the phenomena adequately, our model would simplify to

$$y_{ij} = \mu + \alpha_1 u_{1j} + \beta_1 v_{1j} + \alpha_2 u_{2j} + \beta_2 v_{2j} + e_{ij}$$

all other elements being indistinguishable from a true value of zero. A
significant variation of the monthly means about this curve would require
the term $t_j$, which might represent a third or higher term in the Fourier
series or discrepancies common to each replicate year from some other
source. All remaining elements, with subscripts $i$, measure the differences
from year to year (or replicate to replicate) in successive terms of the
Fourier equation.
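
To make the roles of the elements concrete, the first-harmonic part of the model can be simulated without the random component. This is a sketch under stated assumptions: all parameter values are hypothetical, the second harmonic, $t_j$ and $e_{ij}$ are omitted, and the replicate deviations are chosen to sum to zero so that the pooled coefficients equal the Greek expectations.

```python
import math

def simulate(mu, r, alpha1, beta1, a1_dev, b1_dev, k):
    """Noise-free first-harmonic part of Equation 10, one row per replicate i."""
    c = 2 * math.pi / k
    rows = []
    for i in range(len(r)):
        rows.append([
            (mu + r[i])
            + (alpha1 + a1_dev[i]) * math.cos(c * j)
            + (beta1 + b1_dev[i]) * math.sin(c * j)
            for j in range(k)
        ])
    return rows

# Hypothetical parameters; each set of replicate deviations sums to zero.
k = 12
r = [-1.0, 0.5, 0.5]
a1_dev = [0.3, -0.1, -0.2]
b1_dev = [-0.4, 0.4, 0.0]
y = simulate(50.0, r, 20.0, 3.0, a1_dev, b1_dev, k)

# Pooled coefficients from the interval totals T_t (Equation 8).
f = len(y)
c = 2 * math.pi / k
totals = [sum(row[j] for row in y) for j in range(k)]
a1_hat = sum(math.cos(c * j) * totals[j] for j in range(k)) / (0.5 * f * k)
b1_hat = sum(math.sin(c * j) * totals[j] for j in range(k)) / (0.5 * f * k)
```

The replicate effects $r_i$ drop out of `a1_hat` and `b1_hat` because the cosines and sines each sum to zero over a complete cycle.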

Calculation 

When the elements in Equation 10 are rearranged in the order in which their
variation is isolated in successive rows of the analysis of variance, we
have

$$
\begin{aligned}
y_{ij} ={}& \underset{9}{\mu} + \underset{1}{r_i}
 + \underset{2}{(\alpha_1 u_{1j} + \beta_1 v_{1j})}
 + \underset{3}{(\alpha_2 u_{2j} + \beta_2 v_{2j})}
 + \underset{4}{t_j} \\
 &+ \underset{5}{(a'_{1i} u_{1j} + b'_{1i} v_{1j})}
 + \underset{6}{(a'_{2i} u_{2j} + b'_{2i} v_{2j})}
 + \underset{7}{e_{ij}}
\end{aligned}
\tag{11}
$$

Separate sums of squares are attributable to the unique combinations of
these elements enclosed by parentheses in Equation 11, the number beneath
each term identifying the row in the analysis of variance. Their practical
calculation is outlined in the workform of Table 3, which may be reduced to
that for a sine curve by omitting rows 3 and 6, or extended with additional
Fourier terms. Square brackets [ ] designate the sum of the squares or
products of the factors they enclose measured from their respective means as
the origin, i.e. $[y^2] = \Sigma(y - \bar{y})^2 = \Sigma y^2 - (\Sigma y)^2/fk$,
or correspondingly
$[u_1 y] = \Sigma\{(u_1 - \bar{u}_1)(y - \bar{y})\} = \Sigma(u_1 y)$ since
$\Sigma u_1 = 0$. Its other symbols are defined above, in the workform or in
Equations 7 and 8. For each sum of products the identity,
$\Sigma[u_1 y] = [u_1 T_t]$, provides a useful check on the arithmetic,
which holds similarly for the products with $v_1$, $u_2$, $v_2$, etc. The
sum of squares in each row, designated as $S_1$ to $S_{11}$, is divided by
its degrees of freedom (DF) to obtain the corresponding mean square (MS).
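
As a hedged illustration of one row of such a calculation, the sum of squares for the first harmonic of the New Haven temperatures can be recomputed from the sums of products quoted earlier. Table 3 is not reproduced here, so the formula below is the standard regression sum of squares for two orthogonal variates and is an assumption about the workform; the function name is hypothetical.

```python
def harmonic_sum_of_squares(sum_uT, sum_vT, f, k):
    """([u T]^2 + [v T]^2) / (f*k/2): regression SS for one harmonic, 2 DF."""
    return (sum_uT ** 2 + sum_vT ** 2) / (0.5 * f * k)

# Sums of products quoted in the text for New Haven (f = 14, k = 12).
ss_first = harmonic_sum_of_squares(1763.0944, 292.7604, f=14, k=12)
ms_first = ss_first / 2   # mean square on 2 degrees of freedom
```

The result agrees, to rounding, with the sum of squares 38026.32 and mean square 19013.160 entered for the effect of ($a_1 + b_1$) in Table 5.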

When a given pair of coefficients, $a_i$ and $b_i$, varies significantly
between replicate curves, its harmonic may differ in amplitude, in phase, or
in both. Since amplitude and phase angle are computed from non-linear
combinations of $a_i$ and $b_i$, their relative contributions to the sum of
squares in row 5 or 6 cannot be separated orthogonally. However, if we
disregard phase, we can estimate the total variation in amplitude from
replicate to replicate as a single term in units of $y^2$ from
$\tfrac{1}{2}k\,\Sigma(A - \bar{A})^2$, where $A$ is the semi-amplitude of a
given harmonic in a single replicate (Equation 7) and $\bar{A}$ that for the
same harmonic in the average curve (Equation 8). For the first harmonic this
reduces algebraically to the sum of squares defined in row 10 of Table 3.
The difference between this sum of squares and that in row 5,
$S_5 - S_{10} = S_{11}$, we may attribute to differences in phase. A
significant variation in the second harmonic in row 6 can be subdivided
similarly.


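Table 4. Variance components for the expectations of the mean squares (MS)
in Table 3, where each MS = S_i/DF.

Row   Expected mean square
 1    $\sigma^2 + k\sigma_r^2$
 2    $\sigma^2 + \tfrac{1}{2}k(a_{*1}^2 + b_{*1}^2) + f\sigma_t^2 + \tfrac{1}{2}kf(\alpha_1^2 + \beta_1^2)$
 3    $\sigma^2 + \tfrac{1}{2}k(a_{*2}^2 + b_{*2}^2) + f\sigma_t^2 + \tfrac{1}{2}kf(\alpha_2^2 + \beta_2^2)$
 4    $\sigma^2 + f\sigma_t^2$
 5    $\sigma^2 + \tfrac{1}{2}k(a_{*1}^2 + b_{*1}^2)$
 6    $\sigma^2 + \tfrac{1}{2}k(a_{*2}^2 + b_{*2}^2)$
 7    $\sigma^2$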

Tests  of  significance 

From our model in Equation 11, the mean square in each row of the analysis
of variance contains potentially the variance components in Table 4, on the
assumption that each source of variation about the average Fourier curve can
be considered a random variable. Replicates, for example, are assumed to be
equivalent to a random sample of complete cycles, and the variation of
replicates by each term in the Fourier series to represent similarly a
random selection. We will further assume that any correlation between
successive observations within a replicate is removed in the interaction of
replicates by $a_1$ and $b_1$ and by $a_2$ and $b_2$ in rows 5 and 6 of
Table 3, where the effect of each pair of coefficients is symbolized as
"($a_1 + b_1$)", "($a_2 + b_2$)", etc.

Under these assumptions, the variance components are essentially the same as
those for other replicated regressions, whether linear, curvilinear or
harmonic. The components for regression from $a_i$ and $b_i$ in Equation 11
are designated as $a_*^2$ and $b_*^2$ in Table 4 and converted to units of
$y^2$ by the factor $\tfrac{1}{2}k = \Sigma u_i^2 = \Sigma v_i^2$. The
variance components $\sigma^2$ with subscripts for replicates ($r$) and time
($t$) are already in units of $y^2$, as is the random variance $\sigma^2$
which recurs in each MS and may be an undivided composite.

The error variance for a test of significance or a measure of precision
depends upon which of the relevant components in Table 4 differ effectively
from zero. It may be a single mean square or a linear combination of
variances, and will frequently be designated as $s^2$. When testing the null
hypothesis that the additional component is zero in the mean square $V_i$ in
row $i$ = 1, 4, 5 or 6, the appropriate $s^2$ is $V_7$. The significance of
each observed $F = V_i/V_7$ is determined by reference to a table of $F$ or
the variance ratio, such as that given by Fisher and Yates (1957) or by
Pearson and Hartley (1954). If the mean square for scatter about the fitted
average curve in row 4, for example, is significantly larger than that for
the residual variation in row 7, we would conclude that the deviations about
the replicate curves have a common element.

An $F$ test of the Greek coefficients in rows 2 and 3 is more involved. If the scatter in row 4 or the interaction in row 5 or 6 should prove less than or negligibly larger than the random error, its component would drop out of the sum in row 2 or 3 of Table 4, and the remaining components would determine which single mean square is the appropriate error. When both the scatter in row 4 and the interaction of the first or second harmonic with replicates are significant, the appropriate error is a linear combination of the mean squares ($V_i$) in three different rows (Anderson and Bancroft, 1952). For the effect of $(a_1 + b_1)$, the error is $s^2 = V_4 + V_5 - V_7$ with approximately $n'$ degrees of freedom, estimated as

$$n' = \frac{(V_4 + V_5 - V_7)^2}{(V_4^2/n_4) + (V_5^2/n_5) + (V_7^2/n_7)} \tag{12}$$

For an approximate test of significance, we refer

$$F' = V_2/(V_4 + V_5 - V_7) \tag{13}$$

to a table of the variance ratio ($F$) with $n_1 = 2$ and $n_2 = n'$ degrees of freedom. Similarly, for the second term in the Fourier series the error is $s^2 = V_4 + V_6 - V_7$, with $F'$ and $n'$ determined by Equations 12 and 13, replacing subscript 5 by subscript 6 (and $V_2$ by $V_3$ in the numerator of $F'$).
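Equations 12 and 13 can be checked numerically. The following is a minimal Python sketch, not part of the original bulletin; the function name and argument names are hypothetical, and the sample values are the mean squares and degrees of freedom for the first harmonic of the iodine data in Table 6.

```python
def approximate_f_test(v_effect, v4, v_inter, v7, n4, n_inter, n7):
    """Approximate F' test for one harmonic (Equations 12 and 13).

    v_effect : mean square for the harmonic (row 2 or 3)
    v4       : mean square for scatter about the average curve (row 4)
    v_inter  : mean square for replicates x harmonic (row 5 or 6)
    v7       : residual mean square (row 7)
    n4, n_inter, n7 : degrees of freedom of v4, v_inter and v7
    """
    s2 = v4 + v_inter - v7                       # composite error variance
    n_prime = s2 ** 2 / (v4 ** 2 / n4            # Equation 12
                         + v_inter ** 2 / n_inter
                         + v7 ** 2 / n7)
    f_prime = v_effect / s2                      # Equation 13
    return s2, f_prime, n_prime


# First harmonic of the iodine values (rows 2, 4, 5 and 7 of Table 6).
s2, f_prime, n_prime = approximate_f_test(
    47.4945, 0.5135, 2.1614, 0.1166, n4=5, n_inter=8, n7=28)
print(f"s2 = {s2:.4f}, F' = {f_prime:.2f}, n' = {n_prime:.2f}")
```

The resulting $F'$ would then be referred to a table of $F$ with 2 and $n'$ degrees of freedom, interpolating for the fractional $n'$.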


Examples 

The analysis of variance in Table 5 has been computed from the monthly mean temperatures at New Haven in Table 2. An inspection of the yearly or replicate totals reveals no obvious trend, except possibly for a series of warmer years in the middle of this 14-year period. Since a parabola fitted to the $T_r$'s (not shown here) did not approach significance, we will consider the differences in $T_r$ a random variable. Their mean square $V_1$ exceeds the interaction $V_7$ significantly ($P < 0.02$). The sine curve for the monthly totals ($T_t$) accounts for 96.9% of the total sum of squares and is obviously highly significant. Although the mean square for the second term in the Fourier series, $(a_2 + b_2)$, is larger than the scatter around the two-term Fourier curve, its error depends upon the significance of the mean squares in rows 4 and 6.



Table 5. Analysis of variance of the monthly mean temperatures at New Haven, Conn., in Table 2.

| Row | Term | DF | SS | MS | F |
|---|---|---|---|---|---|
| 1 | Between years | 13 | 138.95 | 10.689 | 2.19 |
| 2 | Months, effect of $(a_1 + b_1)$ | 2 | 38026.32* | 19013.160 | 1695 |
| 3 | Months, effect of $(a_2 + b_2)$ | 2 | 40.07 | 20.035 | 2.20 |
| 4 | Months, scatter | 7 | 57.76* | 8.252 | 1.69 |
| 5 | Years × Month $(a_1 + b_1)$ | 26 | 291.71 | 11.220 | 2.29 |
| 6 | Years × Month $(a_2 + b_2)$ | 26 | 237.17 | 9.122 | 1.87 |
| 7 | Years × Month scatter | 91 | 445.08 | 4.891 | |
| 8 | Total | 167 | 39237.06 | | |
| 9 | Correction, $C_m$ | 1 | 432958.44 | | |
| 10 | Year × Amplitude 1 | 13 | 165.55 | 12.735 | 2.60 |
| 11 | Year × Phase 1 | 13 | 126.16 | 9.705 | 1.98 |
| 12 | Year × Amplitude 2 | 13 | 132.05 | 10.158 | 2.08 |
| 13 | Year × Phase 2 | 13 | 105.12 | 8.086 | 1.65 |

* When recomputed with $\Sigma u_1^2 = \Sigma v_1^2 = 5.999824$ instead of their expectation, $k/2 = 6$, these SS were corrected to 38027.43 and 56.64 respectively, no others differing by more than 0.01.


When compared with the interaction $V_7$, both the first and second Fourier terms varied significantly from year to year, but the scatter about the average curve in row 4 fell within the acceptable range. This last result is in line with Craddock's finding (1955) that temperature records in the northern hemisphere agree quite generally with a two-term Fourier series. Both the scatter in row 4 and its interaction with years in row 7 might have been subdivided by adding a third term to the Fourier series, as in fact was done, but without a significant reduction in the remaining mean squares. Since $V_4$ is not significant, we may retain our null hypothesis that its additional variance component is zero, and compare the mean squares from the first and second terms in the Fourier curve for the 14-year average with their respective interactions by years. For $(a_2 + b_2)$, we have $F = 20.035/9.122 = 2.20$, which is not significant.

To  separate  the  differences  in  amplitude  and  in  phase,  the  variations  of 
the  Fourier  curve  from  year  to  year  in  rows  5  and  6  have  been  subdivided 
in  the  last  four  rows  of  Table  5.  These  indicate  that  for  both  the  first  and 
second  harmonic,  the  amplitude  or  annual  range  differed  somewhat  more 
from  year  to  year  than  the  phase  or  date  of  the  maximum.  The  variation 
in  the  mean  monthly  temperature  will  be  considered  later  in  more  detail. 



Table 6. Analysis of variance of the average monthly iodine values in Appendix Table 1.

| Row | Term | DF | SS | MS | F | F' |
|---|---|---|---|---|---|---|
| 1 | Place | 4 | 65.8543 | 16.4636 | 141.20 | |
| 2 | Months, $(a_1 + b_1)$ | 2 | 94.9890 | 47.4945 | | 18.56† |
| 3 | Months, $(a_2 + b_2)$ | 2 | 9.3625 | 4.6812 | | 6.65†† |
| 3' | Months, $(a_3 + b_3)$ | 2 | 18.2217 | 9.1108 | 17.74 | |
| 4 | Months, scatter | 5 | 2.5673 | .5135 | 4.40 | |
| 5 | Place × Month $(a_1 + b_1)$ | 8 | 17.2916 | 2.1614 | 18.54 | |
| 6 | Place × Month $(a_2 + b_2)$ | 8 | 2.4569 | .3071 | 2.63 | |
| 7 | Place × Month scatter | 28 | 3.2652 | .1166 | | |
| 10 | Place × Amplitude 1 | 4 | 5.0955 | 1.2739 | 10.93 | |
| 11 | Place × Phase 1 | 4 | 12.1961 | 3.0490 | 26.15 | |

† $s^2 = 2.5583$, $n' = 10.27$; †† $s^2 = 0.7040$, $n' = 7.62$.


From the analysis in Table 6 of the iodine values in Appendix Table 1, the three-term Fourier curve accounts for 97.9% of the variation between the monthly totals; there would be little point in adding more terms to the series. The five creameries or replicates differed very significantly in their means and in the first harmonic $(a_1 + b_1)$. When the latter (row 5) was subdivided between amplitude and phase (rows 10 and 11), differences in phase proved the more important. The interactions of place with the third and higher terms proved so nearly equal that they have been pooled in estimating the random error in row 7. From its variance components, the error for testing $(a_3 + b_3)$ in row 3' is the mean square in row 4. Since all random components in the mean squares for $(a_1 + b_1)$ and $(a_2 + b_2)$ are significant, each is tested in terms of $F'$. For the first term, $F' = 47.4945/(0.5135 + 2.1614 - 0.1166) = 18.56$, and the divisor (2.5583) has approximately $n' = 2.5583^2/(0.5135^2/5 + 2.1614^2/8 + 0.1166^2/28) = 10.27$ degrees of freedom by Equation 12; for the second term $F' = 6.65$ with $n' = 7.62$. All three terms of the curve plotted in Figure 4 are clearly significant.

A  systematic  trend  from  replicate  to  replicate  may  be  illustrated  by  the 
progressive  change  in  the  standing  electrical  potential  (Burr,  1945)  of  an 
elm  tree,  which  varies  diurnally.  The  hourly  potentials,  as  read  from  the 
daily  record  for  eight  three-day  periods  from  August  1  to  25,  1953,  have 
been  coded  in  Appendix  Table  2  for  ease  of  analysis.  The  hourly  means 
(in  code)  have  been  fitted  with  the  two-term  Fourier  curve  (Equation  9): 

$$Y = 49.964 - 6.605u_1 - 15.084v_1 + 1.357u_2 + 1.146v_2$$

Decoded, the estimated average potential for each hour is

$$Y' = -66.654 + 2.202u_1 + 5.028v_1 - 0.452u_2 - 0.382v_2$$

which  has  been  plotted  as  the  solid  curve  of  Figure  5.  Except  for  a  slight 
flattening  at  the  upper  and  lower  limits,  as  if  limited  by  maximal  and 
minimal  potentials,  the  fit  seems  very  good;  how  good  we  can  determine 
from  the  analysis  of  variance  in  Table  7. 
Figure 5. Mean hourly potentials in an elm tree fitted with a two-term Fourier curve (solid line) and with a sine curve (broken line), from Appendix Table 2.

Over this period of 25 days, the average potentials, all initially negative, decreased progressively, as indicated by the rise in $T_r$ in Appendix Table 2. In consequence, the variation between replicates has been subdivided into a highly significant linear trend and the scatter about this trend, in rows 1 and 1', with the latter still much greater than the random error in row 7. This trend was succeeded toward the end of the month by a drastic change in the diurnal pattern, possibly in response to the prolonged dry spell in that August.

Since the mean squares for both the first and second Fourier terms are so much larger than the remaining variation between the hourly means (row 4), the two-term Fourier curve seems to fit the plotted points in Figure 5 better than the simpler dotted sine curve. However, the interactions of replicates with the first and with the second term both exceed the residual variation so considerably that the significance of the mean squares in rows 2 and 3 must be tested by $F'$. By this criterion, the first term or sine curve is highly significant but not the second term ($F' = 1.78$, $P \approx 0.20$). Despite its apparently better fit, the more complex curve offers no real advantage in describing the average diurnal variation in tree potential. As judged from Table 7, in studying the relation between the daily tree potentials and environmental factors, such as temperature, cloudiness, soil moisture and humidity, the hourly readings for each day might well be replaced by the first five constants in a Fourier series ($a_0$, $a_1$, $b_1$, $a_2$ and $b_2$) and these used as the dependent variables in a comprehensive analysis.

Table 7. Analysis of variance of the tree potentials for the eight 3-day periods in Appendix Table 2.

| Row | Variance due to | DF | SS | MS | F, F' |
|---|---|---|---|---|---|
| 1 | Linear trend on periods | 1 | 20259.80 | 20259.796 | 21.37 |
| 1' | Scatter about trend | 6 | 5687.82 | 947.971 | 176.33 |
| 2 | Hours, $(a_1 + b_1)$ | 2 | 26030.12 | 13015.062 | 66.17† |
| 3 | Hours, $(a_2 + b_2)$ | 2 | 302.80 | 151.398 | 1.78†† |
| 4 | Hours, scatter | 19 | 297.20 | 15.642 | 2.91 |
| 5 | Period × Hour $(a_1 + b_1)$ | 14 | 2609.93 | 186.424 | 34.68 |
| 6 | Period × Hour $(a_2 + b_2)$ | 14 | 1048.06 | 74.861 | 13.92 |
| 7 | Period × Hour scatter | 133 | 715.01 | 5.376 | |
| 8 | Total | 191 | 56950.74 | | |
| 10 | Period × Amplitude 1 | 7 | 1822.73 | 260.390 | 48.44 |
| 11 | Period × Phase 1 | 7 | 787.20 | 112.457 | 20.92 |

† $s^2 = 196.690$, $n' = 15.50$; †† $s^2 = 85.127$, $n' = 17.53$.
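Reducing each day's hourly readings to its first five Fourier constants can be sketched as follows. This is a minimal Python illustration rather than the bulletin's own computing scheme: it assumes $k$ equally spaced readings at $t = 0, 1, \ldots, k-1$, with $u_j = \cos(jct)$, $v_j = \sin(jct)$ and $c = 2\pi/k$, and the function names are hypothetical. The test series is generated from the coded average curve given earlier, so the constants recovered should reproduce its coefficients.

```python
import math


def fourier_constants(y, terms=2):
    """Return [a0, a1, b1, a2, b2, ...] for k equally spaced readings.

    Assumes the readings are taken at t = 0, 1, ..., k-1 over one full cycle,
    so that the sine and cosine terms are mutually orthogonal.
    """
    k = len(y)
    c = 2 * math.pi / k
    constants = [sum(y) / k]                       # a0 is the mean of the cycle
    for j in range(1, terms + 1):
        a_j = 2 / k * sum(v * math.cos(j * c * t) for t, v in enumerate(y))
        b_j = 2 / k * sum(v * math.sin(j * c * t) for t, v in enumerate(y))
        constants.extend([a_j, b_j])
    return constants


def coded_average_curve(t, k=24):
    """Coded two-term Fourier curve for the mean hourly tree potentials."""
    x = 2 * math.pi * t / k
    return (49.964 - 6.605 * math.cos(x) - 15.084 * math.sin(x)
            + 1.357 * math.cos(2 * x) + 1.146 * math.sin(2 * x))


# A hypothetical "day" of 24 readings lying exactly on the average curve.
day = [coded_average_curve(t) for t in range(24)]
a0, a1, b1, a2, b2 = fourier_constants(day)
print([round(value, 3) for value in (a0, a1, b1, a2, b2)])
```

Applied to each observed day in turn, the five constants would form the dependent variables of the comprehensive analysis suggested above.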


Transformations  of  the  Variate 

In meeting the assumptions of the analysis of variance, the adoption of a suitable unit for the response is often critical. An unsuitable original measurement or count can often be transformed to a unit which is either additive or has a variance independent of the mean. In fulfilling one requirement we frequently meet or approximate the other assumptions in the analysis of variance, and in some cases acquire an expected variance, with which the observed variation can be compared.

Sometimes the transformation can be based upon past experience with the variate or upon a biological relation. Thus, if we expect our measurement to change proportionately or percentagewise with time, such as the incidence of a contagious disease, the appropriate unit would be the logarithm of the incidence. If the initial variable is the number of occurrences or individuals in each unit of time, its distribution, apart from the periodic effect, may well be Poisson. The expected variance of each Poisson count is its unknown population mean, but the appropriate transform, the square root of each count, has a constant variance of 0.25. Our data may be binomial percentages which can be assumed to measure indirectly an underlying threshold response, some function of which is normally distributed in the biological population. The additive transform is then the probit, or the unit, usually the logarithm, to which the probit is linearly related.
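The constant variance of 0.25 for the square root of a Poisson count is an approximation that improves as the mean increases. The Python sketch below, which is an illustration and not part of the original text, evaluates the exact variance of $\sqrt{X}$ by summing over the Poisson distribution; the function name and the truncation rule for the sum are assumptions of this sketch.

```python
import math


def sqrt_poisson_variance(mean, tail=12.0):
    """Variance of sqrt(X) for X ~ Poisson(mean), by direct summation.

    The sum is truncated `tail` standard deviations above the mean, where
    the remaining probability is negligible.
    """
    upper = int(mean + tail * math.sqrt(mean)) + 20
    expected_sqrt = 0.0
    for x in range(upper + 1):
        log_pmf = x * math.log(mean) - mean - math.lgamma(x + 1)
        expected_sqrt += math.sqrt(x) * math.exp(log_pmf)
    # E[(sqrt X)^2] = E[X] = mean, so Var(sqrt X) = mean - (E[sqrt X])^2
    return mean - expected_sqrt ** 2


for m in (5, 50, 153, 508):
    print(m, round(sqrt_poisson_variance(m), 4))
```

For means of the size met in the birth data discussed later (counts of 153 to 508 per hour), the exact variance should lie very close to 0.25, while for small means it departs noticeably from that value.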

Log-transforms 

Since the logarithms of many biological measurements are normally distributed, the logarithmic transformation should be of equal value in periodic regressions, such as of contagious diseases in animals and plants. An example from man is the seasonal variation in the death rate from pneumonia, as recorded in the monthly reports of the Metropolitan Life Insurance Company (1945-1955). The month of September, when deaths are near a minimum, has been selected here as the starting time ($t_0$) for each annual cycle in Appendix Table 3, where each monthly rate per 100,000 has been transformed to its logarithm, a unit which stabilizes the variance through the year. The log-death rates for September 1945 through December 1949, when deaths were classified by the 5th Revision of the International List of Causes of Death, have been adjusted here to conform with the 6th Revision used subsequently by subtracting from each earlier log-death rate the mean difference (0.235) during the twelve months of 1950 when both criteria were reported.

Figure 6. Mean monthly log-death rates from pneumonia and fitted sine curve, from Appendix Table 3.

The sine curve, $Y = 1.2087 - 0.1647u_1 + 0.0535v_1$, has been computed with Equation 8 from the monthly totals $T_t$ and plotted in Figure 6. By Equation 4 the seasonal range in the mean log-death rate is more than two-fold, $2A = 0.3464 = \log(2.220)$. By Equation 5 and Figure 2, its maximum at $\tan\theta' = 0.32507$ and phase angle $\theta = 2.8273$ radians corresponds to 5.400 months from the starting point of each annual cycle in mid-September. This places the maximum death rate at approximately February 25 and the minimum six months later.
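The semi-amplitude and phase quoted above can be recovered from the two harmonic coefficients. The following Python sketch is illustrative only; it assumes that the sine curve has the form $Y = a_0 + A\cos(ct - \theta)$ with $a_1 = A\cos\theta$ and $b_1 = A\sin\theta$, which agrees with the figures in the text, and the function name is hypothetical.

```python
import math


def amplitude_and_phase(a1, b1, k=12):
    """Semi-amplitude A, phase angle theta (radians), and time of maximum.

    Assumes Y = a0 + a1*cos(c*t) + b1*sin(c*t) = a0 + A*cos(c*t - theta),
    with c = 2*pi/k and t measured in units of the k intervals per cycle.
    """
    amplitude = math.hypot(a1, b1)
    theta = math.atan2(b1, a1) % (2 * math.pi)   # places theta in its quadrant
    t_max = theta * k / (2 * math.pi)
    return amplitude, theta, t_max


# Coefficients of the pneumonia sine curve (log-death rates, k = 12 months).
A, theta, t_max = amplitude_and_phase(-0.1647, 0.0535)
seasonal_ratio = 10 ** (2 * A)   # range 2A in logarithms, as a ratio of rates
print(f"2A = {2 * A:.4f}, ratio = {seasonal_ratio:.3f}")
print(f"theta = {theta:.4f} rad, maximum at t = {t_max:.3f} months")
```

Using `atan2` rather than the arctangent of $b_1/a_1$ alone avoids the separate quadrant lookup for which the text refers to Figure 2.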

Table 8. Analysis of variance of the log-death rates from pneumonia in Appendix Table 3.

| Row | Term | DF | SS | MS | F, F' |
|---|---|---|---|---|---|
| 1 | Years, trend on $x_1$ | 1 | .80964 | .80964 | 246.09 |
| 1' | Years, trend on $x_2$ | 1 | .06494 | .06494 | 19.74 |
| 1" | Years, scatter | 7 | .02303 | .00329 | 1.52 |
| 2 | Months, $(a_1 + b_1)$ | 2 | 1.79995 | .89998 | 107.83† |
| 3 | Months, $(a_2 + b_2)$ | 2 | .00445 | .00223 | 0.25†† |
| 4 | Months, scatter | 7 | .03650 | .00521 | 2.40 |
| 5 | Years × Month $(a_1 + b_1)$ | 18 | .09540 | .00530 | 2.44 |
| 6 | Years × Month $(a_2 + b_2)$ | 18 | .10607 | .00589 | 2.72 |
| 7 | Years × Month scatter | 63 | .13662 | .00217 | |
| 10 | Year × Amplitude 1 | 9 | .06990 | .00777 | 3.58 |
| 11 | Year × Phase 1 | 9 | .02550 | .00283 | 1.31 |
| 12 | Year × Amplitude 2 | 9 | .07347 | .00816 | 3.76 |
| 13 | Year × Phase 2 | 9 | .03260 | .00362 | 1.67 |

† $s^2 = 0.008346$, $n' = 12.6$; †† $s^2 = 0.008939$, $n' = 13.6$.


The progressive decrease in the yearly totals ($T_r$) (Appendix Table 3) has been fitted with the linear and quadratic orthogonal polynomials, $x_1$ and $x_2$, for a series of 10 (Fisher and Yates, 1957). This parabola accounts effectively (97.4%) for the trend between years, as judged from rows 1 to 1" of the analysis of variance (Table 8). A similar proportion (97.8%) of the sum of squares between the monthly totals ($T_t$) is absorbed by the harmonic coefficients $a_1$ and $b_1$. Since the mean square for the second harmonic is less than that for the remaining scatter, little would be gained by adding more terms.

The variation from year to year in both of the first two harmonics exceeds the remaining interaction with years significantly, despite the disappearance of the 2nd harmonic from the average curve. When isolated from the residual sum of squares in row 7, the mean squares for the higher terms decreased progressively, but in the absence of an expected error variance with which to compare them, they have been pooled in the analysis. As judged from the last four rows in Table 8, the first two harmonics were considerably more stable in phase from year to year than in amplitude.


Figure 7. Mean monthly log-incidence of poliomyelitis in the United States with two-term Fourier curve, from Appendix Table 4.

A similar analysis of another contagious disease with a marked seasonal incidence, poliomyelitis, reveals a different pattern. The U. S. monthly incidences per million have been changed to logarithms in Appendix Table 4 (Serfling and Sherman, 1953, 1958) and analyzed in Table 9. Although a parabola accounts for much of the overall difference between years ($T_r$), the scatter about this trend (row 1") is here far larger than that about the annual curves (row 7). Instead of a simple sine curve, the monthly totals ($T_t$) define the two-term Fourier curve in Figure 7, with both terms significant and the equation

$$Y = 1.8517 - 0.6397u_1 + 0.4161v_1 - 0.0252u_2 - 0.0861v_2$$

This increases in 24 weeks from a minimum, approximately on March 23, to a peak 35 times as great on September 7, and then returns in the following 28 weeks to its minimum. Here the variation in both terms from year to year is about equally distributed between amplitude and phase.
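With two harmonics the maximum and minimum no longer follow from a single phase angle, but they can be located numerically. The Python sketch below is an illustration under stated assumptions: $u_j = \cos(jx)$ and $v_j = \sin(jx)$ with $x = ct$, a simple grid search over one cycle, and hypothetical variable names. The calendar dates in the text depend also on the starting point of the cycle, which is not reproduced here; only the ratio of peak to trough and the share of the year spent rising are computed.

```python
import math


def polio_curve(x):
    """Fitted mean log-incidence at angle x = c*t (radians) in the annual cycle."""
    return (1.8517 - 0.6397 * math.cos(x) + 0.4161 * math.sin(x)
            - 0.0252 * math.cos(2 * x) - 0.0861 * math.sin(2 * x))


steps = 36525                                   # about a hundredth of a day
values = [polio_curve(2 * math.pi * i / steps) for i in range(steps)]
i_max = max(range(steps), key=values.__getitem__)
i_min = min(range(steps), key=values.__getitem__)

# The curve is in logarithms, so the peak-to-trough ratio is 10**(difference).
ratio = 10 ** (values[i_max] - values[i_min])
# Fraction of the cycle from the minimum forward to the next maximum.
rise_fraction = ((i_max - i_min) % steps) / steps
print(f"peak/trough ratio = {ratio:.1f}")
print(f"rise = {rise_fraction * 52.18:.1f} weeks, "
      f"fall = {(1 - rise_fraction) * 52.18:.1f} weeks")
```

The asymmetry between the rising and falling parts of the year is what the second harmonic contributes; a single sine curve would divide the year equally.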
Table 9. Analysis of variance of seasonal incidence of poliomyelitis in Appendix Table 4.

| Row | Term | DF | SS | MS | F, F' |
|---|---|---|---|---|---|
| 1 | Years, linear trend | 1 | 3.49183 | 3.49183 | 10.34 |
| 1' | Years, quadratic curv. | 1 | 2.86902 | 2.86902 | 8.49 |
| 1" | Years, scatter | 12 | 4.05418 | .33785 | 94.45 |
| 2 | Months, $(a_1 + b_1)$ | 2 | 52.40486 | 26.20243 | 377.83† |
| 3 | Months, $(a_2 + b_2)$ | 2 | .72484 | .36242 | 11.55†† |
| 4 | Months, scatter | 7 | .14494 | .02071 | 5.79 |
| 5 | Years × Month $(a_1 + b_1)$ | 28 | 1.46223 | .05222 | 14.60 |
| 6 | Years × Month $(a_2 + b_2)$ | 28 | .39758 | .01420 | 3.97 |
| 7 | Years × Month scatter | 98 | .35051 | .00358 | |
| 8 | Total | 179 | 65.89999 | | |
| 10 | Years × Amplitude 1 | 14 | .71986 | .05142 | 14.37 |
| 11 | Years × Phase 1 | 14 | .74237 | .05303 | 14.82 |
| 12 | Years × Amplitude 2 | 14 | .19934 | .01424 | 3.98 |
| 13 | Years × Phase 2 | 14 | .19824 | .01416 | 3.96 |

† $s^2 = 0.06935$, $n' = 30.29$; †† $s^2 = 0.03137$, $n' = 14.35$.


Square  root  transform 




The advantages of a theoretical error term are evident in the square root transformation for a Poisson variate. Data on the number of normal human births per hour have been assembled by King (1956) from the records of five hospitals, the two with the fewest births having been combined in Appendix Table 5 into a single series (A). If the number of births per hour within each series had varied entirely at random, we would expect its 24 values to follow the Poisson distribution and its variance to equal its mean. Because of differences in the size of the four series and potentially in the hour of birth, the variance has been stabilized by transforming each number of births, ranging from 153 to 508, to its square root (Bartlett, 1936). The hourly means have been plotted in Figure 8 and fitted with the sine curve, $Y = 18.3542 + 0.1085u_1 + 1.3615v_1$.

The adequacy of a simple sine curve has been tested by the analysis of variance in Table 10 of the transformed variates $y$. If our Poisson hypothesis is correct, the mean square for error in row 7, $s^2 = 0.242$ with 63 degrees of freedom, should not differ significantly from its expectation 0.25. Since the agreement is excellent, each sum of squares for which $s^2$ serves as the error becomes a $\chi^2$ when divided by 0.25 and has the same number of degrees of freedom as before.

Figure 8. Mean hourly incidence of births in five hospitals and sine curve, from Appendix Table 5.

The average sine curve in Figure 8 accounts for 89.9% of the variation in the means, with the highest birth rate at 6:12 a.m. Although the remaining scatter is significant ($\chi^2 = 40.42$, $P = 0.007$), it would not be reduced appreciably by adding the second term in a Fourier series. Separate sine curves for the four series also differed significantly, primarily in amplitude and relatively little in phase. The larger deviations in birth time, or its recording, in row 4 tend to recur in all four series, due in part, King suggests, to similarities in hospital routine. Thus the recording of births may be delayed by the nurses' conference between 7 and 8 a.m. when the staff changes, and the balanced low and high points in the hours starting at 3 and 4 and at 7 and 8 p.m. may have similar explanations. This location of observation periods when the recording may be at fault is another advantage of periodic regression. Because of the significant variance components in rows 4 and 5, the critical test for $(a_1 + b_1)$ in the average sine curve is $F' = 30.80$ with an error variance of $s^2 = 1.4538$ ($n' = 8.19$) and $P < 0.001$.

Table 10. Analysis of variance of the hourly frequency of human births in Appendix Table 5; $\chi^2$ = SS/0.25.

| Row | Term | DF | SS | MS | $\chi^2$ | P |
|---|---|---|---|---|---|---|
| 1 | Between series | 3 | 750.1977 | 250.0659 | 3000.79 | <.001 |
| 2 | Hours, effect of $(a_1 + b_1)$ | 2 | 89.5420 | 44.7710† | | <.001 |
| 4 | Hours, scatter | 21 | 10.1042 | .4812 | 40.42 | .007 |
| 5 | Series × Hour $(a_1 + b_1)$ | 6 | 7.2891 | 1.2148 | 29.16 | <.001 |
| 7 | Series × Hour scatter | 63 | 15.2561 | .2422 | 61.02 | .55 |
| 8 | Total | 95 | 872.3891 | | | |
| 10 | Series × Amplitude 1 | 3 | 5.3943 | 1.7981 | 21.58 | <.001 |
| 11 | Series × Phase 1 | 3 | 1.8948 | .6316 | 7.58 | .055 |

† $F' = 44.7710/1.4538 = 30.80$, $n_1 = 2$, $n' = 8.2$.


Probit transform

In bioassays of toxicants, such as insecticides or fungicides, and of drugs, the susceptibility of the test organism varies so commonly and usually so unpredictably that a reference or Standard preparation is almost invariably tested concurrently with the sample or Unknown. The variation in susceptibility may be so large, however, as to complicate the selection of a suitable range of dosage levels, especially when the response is a binomial percentage. In an extreme example, the same series of fungicidal concentrations might kill all test spores at one season and none at another. In either case the experiment would be valueless as an assay. If the spore susceptibility were to vary predictably through the year, the concentrations could be so adjusted as to obtain on each occasion an adequate number of intermediate mortalities between 0 and 100 percent. A response in which the seasonal variation has been studied systematically is that of the toad Bufo arenarum to chorionic gonadotrophin (Penhos et al, 1954). For two years 40 male toads were collected in the field on the first of each month and on the following day injected in four lots each of 10 toads with the same four dosage levels of the International Standard. The number of individuals in each lot which reacted positively, by releasing sperm, is recorded in Table 11. Not more than one dose in each test produced a reaction of either 0 or 100 percent.

Our problem is to predict from these data the response to be expected at each dosage level in each month of the year. As an all-or-none reaction, we would expect the probit for each percentage to be linearly related to the logarithm of the dose, as indeed proved true. The first step, therefore, was to convert each percentage between zero and 100 to its empirical probit, and to estimate the provisional slope $b = 5.27$ from these values on the assumption that all 24 curves are parallel. From these parallel preliminary curves a provisional expected probit could be estimated for each lot in which none or all of the toads reacted, and then by suitable tables (Fisher and Yates, 1957) its corresponding working probit. This completes the set of 24 probits at each of the four dosage levels, their sums leading to a new unweighted provisional slope of $b = 5.704$. From the sums of the 8 probits for each of the 12 calendar months, a sine curve could be computed by Equation 8 for predicting the mean probit in each month as $Y = 4.9751 + 0.5327u_1 - 0.2693v_1$. With the provisional $b$ and $Y$ it was a simple matter to calculate the expected probit for each of the four dosage levels in each calendar month. These determine the weighting coefficients $w$ and, with the observed proportion of positive reactions in each lot, the working probits $y$ for computing the maximum likelihood estimates of the 24 curves. (Bliss, 1952; Finney, 1952)

Table 11. Number of toads, Bufo arenarum, in each group of 10 reacting positively to four different doses of chorionic gonadotrophin measured in international units per animal, and the log-ED50 computed from each test and from the average sine curve. (Penhos et al, 1954)

| Month | 1951-52: No.(+) at dose 40 | 30 | 22.5 | 15 | Log-ED50 | 1952-53: No.(+) at dose 40 | 30 | 22.5 | 15 | Log-ED50 | Log-ED50 from sine curve |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Nov | 10 | 8 | 7 | 3 | 1.272 | 10 | 9 | 6 | 2 | 1.288 | 1.318 |
| Dec | 9 | 7 | 5 | 2 | 1.358 | 9 | 7 | 6 | 3 | 1.325 | 1.353 |
| Jan | 8 | 6 | 4 | 2 | 1.404 | 7 | 5 | 3 | 1 | 1.471 | 1.402 |
| Feb | 7 | 5 | 4 | 1 | 1.452 | 8 | 6 | 4 | 1 | 1.420 | 1.453 |
| Mar | 7 | 5 | 3 | 1 | 1.464 | 6 | 3 | 2 | 0 | 1.552 | 1.492 |
| Apr | 7 | 4 | 3 | 0 | 1.502 | 8 | 6 | 2 | 0 | 1.471 | 1.508 |
| May | 9 | 2 | 1 | 0 | 1.536 | 8 | 4 | 1 | 0 | 1.520 | 1.498 |
| Jun | 9 | 5 | 4 | 1 | 1.420 | 3 | 5 | 4 | 1 | 1.437 | 1.464 |
| Jul | 9 | 6 | 4 | 1 | 1.405 | 9 | 6 | 3 | 0 | 1.439 | 1.415 |
| Aug | 10 | 7 | 4 | 2 | 1.353 | 9 | 7 | 3 | 2 | 1.388 | 1.364 |
| Sep | 10 | 8 | 5 | 3 | 1.304 | 10 | 7 | 4 | 2 | 1.356 | 1.325 |
| Oct | 10 | 8 | 5 | 2 | 1.322 | 10 | 8 | 4 | 2 | 1.339 | 1.308 |
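The first step described above, converting each intermediate percentage to its empirical probit, can be sketched in Python. The sketch assumes the conventional definition of the probit as the normal equivalent deviate plus 5; the function name is hypothetical, and lots with 0 or 100 percent response are simply flagged, since their working probits come from tables or from an iterative fit that is not reproduced here.

```python
from statistics import NormalDist


def empirical_probit(positives, n=10):
    """Empirical probit of a lot: 5 + normal deviate of the observed proportion.

    Returns None for lots with 0 or 100 percent response, which have no
    finite empirical probit and require a working probit instead.
    """
    p = positives / n
    if p <= 0.0 or p >= 1.0:
        return None
    return 5.0 + NormalDist().inv_cdf(p)


# November 1951: toads reacting at doses of 40, 30, 22.5 and 15 i.u. (Table 11).
november_1951 = [10, 8, 7, 3]
probits = [empirical_probit(r) for r in november_1951]
print([None if y is None else round(y, 3) for y in probits])
```

Plotting the finite probits against the logarithm of the dose gives the provisional straight line from which the expected probit for the 100 percent lot would be read.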

The variation in $y$ about the 24 separately computed curves was well within the sampling error, $\Sigma\chi^2 = 15.84$ for 38 degrees of freedom. When tested for differences in slope, the curves proved satisfactorily parallel ($\chi^2 = 4.28$, $n = 23$) with a combined slope of $b_c = 5.4724$. Given this slope and for each curve its weighted mean log-dose $\bar{x}$ and probit $\bar{y}$, the ED50 in logarithms has been determined for each month as listed in Table 11. The sums of the replicate responses in the two years were then fitted with the single sine curve

$$\text{Log-ED50} = 1.4083 - 0.08987u_1 + 0.04462v_1$$

with which the expectations in the last column of Table 11 have been determined. The computed curve and the observed log-ED50 for each month have been plotted in Figure 9.
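The expectations in the last column of Table 11 follow directly from the fitted sine curve. The Python sketch below is illustrative and rests on one assumption inferred from the tabulated values: that November is $t = 0$ and $u_1 = \cos(ct)$, $v_1 = \sin(ct)$ with $c = 2\pi/12$. The names are hypothetical.

```python
import math

MONTHS = ["Nov", "Dec", "Jan", "Feb", "Mar", "Apr",
          "May", "Jun", "Jul", "Aug", "Sep", "Oct"]


def expected_log_ed50(t, k=12):
    """Log-ED50 predicted by the average sine curve for month t (Nov = 0)."""
    x = 2 * math.pi * t / k
    return 1.4083 - 0.08987 * math.cos(x) + 0.04462 * math.sin(x)


expected = {month: expected_log_ed50(t) for t, month in enumerate(MONTHS)}
for month, log_ed50 in expected.items():
    # The antilogarithm gives the ED50 itself in international units.
    print(f"{month}: log-ED50 = {log_ed50:.3f}, ED50 = {10 ** log_ed50:.1f} i.u.")
```

A table of this kind is what would guide the choice of dosage levels for an assay in any given month.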

In an analysis of variance, the log-ED50's for the two years agreed in their annual means, in their separately fitted sine curves, and in the random scatter about these curves. An expected variance was then determined for each log-ED50 from the sum of the weights ($\Sigma w$) for its log-dose probit curve and the square of the difference, $(\bar{y} - 5)^2$. These varied by less than 7 percent so that an average variance, $\sigma^2 = 0.001795$, could be based upon two means, $\Sigma w = 18.775$ and $(5 - \bar{y})^2 = 0.06742$, from the internal evidence of the separate monthly determinations. With this expected error variance, the total sum of squares about the average sine curve (from the analysis of variance of the log-ED50's) could be converted to $\chi^2 = 0.023881/0.001795 = 13.31$ with 21 degrees of freedom.
Figure 9. Log-ED50 for gonadotrophin in toads in 24 successive months and annual sine curve, from Table 11.

The observations in Table 11 agree so well with our mathematical model that the three constants in the sine curve plus the combined slope $b$ provide an adequate description of the response of this species to gonadotrophin through the two years of the experiment. Indeed, the three main sources of variation (of the working probits $y$ about the 24 straight lines, between the slopes of these lines, and of the log-ED50's about the sine curve) all had smaller $\chi^2$'s than would be expected binomially and were consistent with one another. When totalled over all sources, $\Sigma\chi^2 = 33.425$ with approximately 82 degrees of freedom, after allowing for each probit with an expectation of less than 0.5 positive or negative response. The probability for so small a combined $\chi^2$, $P < 0.000001$, is well outside the range attributable to our initial hypothesis of simple binomial variation.

The seeming paradox can probably be traced to differences in the inherent
sensitivity of the field-collected experimental animals. If on a given
day these represented several collecting points with unequal thresholds of
response, and if the toads from each location were assigned equally or at
random to four test groups, each group of n toads would be a mixture
from several populations of sensitivity. For a given dose of hormone, the
mean of the p's for the different populations would give an unbiased
proportionate response p but, as noted by Kendall (1945), its variance
would be reduced from the binomial $npq$, as assumed in probit analysis,
to $npq - nV(p)$, where $V(p)$ is the variance in p between populations.
Why this would reduce the observed variance may be illustrated by a
hypothetical extreme case in which half of the toads at a given dose were
collected from a field population of resistant individuals which never
reacted and the other half from a different source of very susceptible toads
which always reacted. Their combined response would always be exactly
50 percent with a variance of zero.
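
The reduction from $npq$ to $npq - nV(p)$ can be checked with a few lines of arithmetic. The sketch below assumes that a fixed number of animals is taken from each source, so that the variance of the total number reacting is the sum of the separate binomial variances; the function names and the group sizes (5 + 5 and 6 + 4) are illustrative assumptions, not values from the experiment.

```python
from fractions import Fraction


def pooled_binomial_variance(sizes, probs):
    """Variance npq that probit analysis would assume for the whole group."""
    n = sum(sizes)
    p_bar = sum(n_i * p_i for n_i, p_i in zip(sizes, probs)) / n
    return n * p_bar * (1 - p_bar)


def mixture_variance(sizes, probs):
    """Actual variance when fixed numbers n_i come from sources with rates p_i."""
    return sum(n_i * p_i * (1 - p_i) for n_i, p_i in zip(sizes, probs))


def variance_between_sources(sizes, probs):
    """V(p): variance of p between sources, weighted by the numbers drawn."""
    n = sum(sizes)
    p_bar = sum(n_i * p_i for n_i, p_i in zip(sizes, probs)) / n
    return sum(n_i * (p_i - p_bar) ** 2 for n_i, p_i in zip(sizes, probs)) / n


# Extreme case from the text: half never react, half always react.
extreme_sizes = [5, 5]                       # hypothetical group of 10 toads
extreme_probs = [Fraction(0), Fraction(1)]

# A milder, equally hypothetical mixture.
mixed_sizes = [6, 4]
mixed_probs = [Fraction(1, 5), Fraction(4, 5)]

for sizes, probs in ((extreme_sizes, extreme_probs), (mixed_sizes, mixed_probs)):
    n = sum(sizes)
    npq = pooled_binomial_variance(sizes, probs)
    actual = mixture_variance(sizes, probs)
    v_p = variance_between_sources(sizes, probs)
    print(f"npq = {npq}, V(p) = {v_p}, npq - nV(p) = {npq - n * v_p}, actual = {actual}")
```

In the extreme case the binomial expectation is 5/2 while the actual variance is zero, as stated in the text.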


Adjustment  by  Covariance 

A  biological  response  may  be  influenced  by  prior  or  concomitant  variables 
which,  though  measurable,  are  impossible  or  impracticable  to  control.  A 
climatic  factor,  for  example,  is  far  easier  to  measure  than  to  control,  and 
any  effect  it  may  have  upon  a  biological  response  can  then  be  adjusted  by 
covariance.  If  the  covariate  is  quite  unrelated  to  the  cyclic  pattern  of  the 
response  or  variate,  covariance  may  reduce  the  experimental  error  in  the 
response  and  strengthen  its  underlying  periodic  regression.  Alternatively, 
the  covariate  may  display  periodicities  so  similar  to  that  of  the  variate, 
that  covariance  greatly  reduces  or  eliminates  the  initial  periodicity  in  the 
response;  it  then  aids  in  interpreting  the  underlying  phenomenon.  In  either 
case,  the  adjustment  for  the  covariate  depends  primarily  upon  the  linear 
regression  of  the  response  y  upon  the  covariate  x  as  computed  from  the 
sums of squares $[x^2]$ and of products $[xy]$ in the error row of the analysis.

A case in point is the diurnal variation in the heat exchange of cows
reported by Thompson (1954). In an experimental barn under close
environmental control, the average heat exchange was determined in BTU's
per hour for six animals on each of three days. These measurements were
paralleled by a record of the humidity expressed as pounds of water per
pound of dry air, the mixing ratio, on the three days of the test, in all
cases at an average temperature of 50°F. In fitting a sine curve to the
initial data, Thompson noted that the humidities seemed not to follow any
periodic pattern. In Appendix Table 6, the individual observations of
humidity have been coded and the BTU's transformed to logarithms ($-3$)
with a gain in consistency.

Three columns of the analysis of covariance in Table 12 are sums of
squares from analyses of variance of the covariate $[x^2]$ and of the variate
$[y^2]$, and the corresponding sums of their products $[xy]$ in which the
numbers formerly squared are here cross-multiplied, all other operations being
identical. Comparisons of the mean squares from rows 2 and 4 show in
the column for $[y^2]$ a well-marked sine curve (F = 17.17) in terms of
the heat exchange but in that for $[x^2]$ no trace of a sine curve (F = 0.34)
in terms of the mixing ratio. In consequence, the covariate x is here
essentially an environmental rather than an explanatory adjustment. The second
term of a Fourier series fitted to the heat exchange proved negligible and
has not been isolated in Table 12. In the error row, representing the
interaction of days by scatter, the highly significant linear regression of y
upon x (F = 33.65) accounts for $100 \times 0.014303/0.031714 = 45\%$ of the
unadjusted error in the log-BTU, y.


Figure 10. Log-BTU exchange in cows and sine curve for intervals starting
at each stated hour, adjusted for differences in relative humidity, from
Appendix Table 6.


After correction for the covariate, the ratio of the reduced mean square
for the average sine curve in row 2 has increased relative to that for scatter
in row 4 (F = 19.57). However, both the scatter in row 4 and the
interaction of days by ($a_1 + b_1$) in row 5 are so very significant
(P < 0.001), that the appropriate error for the average sine curve is the
combination of the reduced mean squares in rows 4, 5 and 7,
$s^2 = 0.001708 + 0.004332 - 0.000425 = 0.005615$ with 5.14 degrees of
freedom, from which F′ = 5.95 and the true significance of the adjusted
curve is P < 0.05. The hourly means, adjusted for the covariate x with the
slope $b_{yx} = 0.11068$, have been plotted in Figure 10 with the adjusted
sine curve, $Y = 0.47793 - 0.00736u_1 + 0.04255v_1$.
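
The arithmetic of a covariance adjustment of this kind can be sketched briefly. The function names and the toy sums of squares and products below are illustrative assumptions (the entries of Table 12 are not reproduced here); only the two error sums of squares quoted in the text, 0.014303 and 0.031714, are taken from the source.

```python
def covariance_adjustment(sxx, sxy, syy):
    """Slope of y on x, the sum of squares it removes, and the adjusted [y^2].

    sxx, sxy and syy stand for the error-row entries [x^2], [xy] and [y^2].
    """
    slope = sxy / sxx
    removed = sxy ** 2 / sxx          # sum of squares due to the regression
    return slope, removed, syy - removed


def adjusted_mean(y_mean, x_mean, x_grand_mean, slope):
    """Mean of y adjusted to the grand mean of the covariate."""
    return y_mean - slope * (x_mean - x_grand_mean)


# Toy error row (made-up numbers, chosen for easy checking).
toy_slope, toy_removed, toy_adjusted = covariance_adjustment(sxx=4.0, sxy=2.0, syy=3.0)
toy_mean = adjusted_mean(y_mean=10.0, x_mean=3.0, x_grand_mean=2.0, slope=toy_slope)

# Share of the unadjusted error removed in the cow example, from the text.
percent_removed = 100 * 0.014303 / 0.031714

print(toy_slope, toy_removed, toy_adjusted, toy_mean)
print(round(percent_removed, 1))
```

The second computation reproduces the 45% quoted for the log-BTU error.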

Precision of the Computed Curve

The statistics of the Fourier curve, such as its mean amplitude and phase,
are estimates subject to error. In considering their precision, we will restrict
ourselves to the first harmonic or sine curve. Of the several sources of
variation to which it is subject, the most nearly random is the residual
error about the series of curves fitted separately to each replicate and
designated as $\sigma^2$ in Table 4. A second source is the scatter of, say,
the monthly means of the f replicates about the average fitted curve, which
involves an additional variance component for scatter. A third source, the
variation between the sine curves fitted to each replicate, is divided between
the sum of squares for replicate means or totals ($a_0$) with f−1 degrees of
freedom, and that for the interaction of replicates by ($a_1 + b_1$) with
2(f−1) degrees of freedom. The replicate means especially may include a
systematic element which, when segregated, leaves an essentially random
composite of $\sigma^2$ and a second variance component, as in the analysis
of the tree potentials in Table 7. For predictions from the average curve to
the population of which the replicate equations are a sample, the error
variances for $a_0$ and for the regression coefficients $a_1$ and $b_1$
rarely contain quite the same components.

Error  terms  for  each  statistic 

The error variance of each statistic, as derived by large sample theory,
is in terms of the population variance $\sigma^2$ but in practice is solved
with an estimated $s^2$ based upon the mean squares in an analysis of
variance. The statistics $a_0$, $a_1$ and $b_1$ in Equation 3 or 6 have error
variances similar to those for linear regression equations. The variance of
$a_0$ is

$$V(a_0) = \sigma^2/N \tag{14}$$

for N values of the variate y, where our estimate of $\sigma^2$ is usually the
mean square between replicates in an analysis of variance. In common with the
linear regression coefficient, the error variance of $a_1$ and of $b_1$ is
$\sigma^2$ divided by the denominator of the coefficient or

$$V(a_1) = V(b_1) = \frac{\sigma^2}{f\,\Sigma u_1^2} = \frac{2\sigma^2}{fk} \tag{15}$$

where f is the number of replicates at each of k intervals in the cycle. The
estimate of $\sigma^2$ will depend upon which of the variance components
defined in Table 5 have proved significant in the analysis of variance.

The functions of $a_1$ and $b_1$ are of as much interest as the coefficients
themselves. One of these, the semi-amplitude $A = \sqrt{a_1^2 + b_1^2}$, can be
shown to have the same variance as the coefficients from which it is
computed or

$$V(A) = \frac{\sigma^2}{f\,\Sigma u_1^2} = \frac{2\sigma^2}{fk} \tag{16}$$

These variances are in units of $y^2$. In contrast, the variance of the phase
angle $\theta = \tan^{-1}(b_1/a_1)$ is in terms of radians squared and is
estimated as

$$V(\theta) = \frac{2\sigma^2}{fkA^2} \tag{17}$$

This can be converted, of course, from radians to units of the original
cycle. The square root of each variance is the standard error of its statistic.
When computing confidence or fiducial limits for a given probability 1−P,
the standard error is multiplied by the corresponding Student's t for the
degrees of freedom n in the estimate of $\sigma^2$.

These estimates of precision may be illustrated with the example in
Appendix Table 2 on the diurnal variation in the standing potential of an
elm tree, which includes a trend. Since each variate y is the sum of the
potentials at a given hour on three successive days, coded by changing
the sign and subtracting 150, reversing the code and dividing by 3
converts each y to the original unit. Each mean square in Table 7 is decoded
by dividing by $3^2$. Because of the progressive decrease in the average
potential through the period covered by the data, the estimate of $a_0$ and
its error are contingent upon the date for which the equation is to be solved.

For any day (x) from August 1 to 25, 1953, inclusive, our estimate of
the position of each curve is $a_0 = -60.359 - 0.4751x$. With this
proviso, the variance of $a_0$ is computed with the mean square for the
scatter about the trend, 947.9706/9 = 105.3301, to obtain by Equation 14,
$V(a_0) = 105.3301/192 = 0.54859$. At the mean date, $\bar{x}$ = August 13.25,
the standard error of $a_0$ is $\sqrt{0.5486} = 0.7407$; at any other date its
variance would be increased by the variance of the slope multiplied by
$(x - \bar{x})^2$. Whenever the variation in the replicate totals defines a
trend, the estimate of $a_0$ is subject to a similar limitation.

Since the mean squares for both scatter and the interaction of replicate
by ($a_1 + b_1$) are here significant, the variance of the regression
coefficients $a_1$ and $b_1$ is a linear combination of three mean squares,
$s^2 = (15.6421 + 186.4238 - 5.3761)/9 = 21.8544$ with 15.50 degrees of
freedom (Equation 12). The regression coefficients, $a_1 = 2.2017$ and
$b_1 = 5.0279$, and the semi-amplitude, $A = \sqrt{30.1275} = 5.4889$, have
identical variances: $V(a_1) = V(b_1) = V(A) = 2 \times 21.8544/(8 \times 24) = 0.22765$,
and a standard error of $\sqrt{0.22765} = 0.47713$.

The tangent of the phase angle $\theta$ can be computed without smoothing
error from the coded numerators for $a_1$ and $b_1$ as
$\tan\theta' = (-1448.035)/(-634.103) = 2.2836$. Since $b_1$ and $a_1$ are
both positive after decoding, $\theta = \theta'$ and the phase angle is
$\theta = 1.1580$ radians. Multiplying by $24/2\pi$ converts $\theta$ from
radians to $24 \times 1.1580/6.2832 = 4.4234$ hours, as measured from our
first reading at midnight, which places the maximum potential at 4:25 a.m.
For the variance of $\theta$, we have from Equation 17 and the variance of
$a_1$, $V(\theta) = 0.22765/30.1275 = 0.007556$. In terms of hours the phase
angle has a standard error of $24 \times 0.08693/6.2832 = 0.3320$.

Figure 11. Confidence limits for the coefficients of the sine curve in Figure 5.

Each of these standard errors, with approximately 15.50 degrees of
freedom, is multiplied by Student's t = 2.1255 at P = 0.05 in computing the
95% fiducial or confidence limits. For $a_1$, $b_1$ and A, the limits are
$2.1255 \times 0.47713 = 1.0141$ above and below each statistic. Their
relations are shown conveniently in Figure 11, where $b_1$ has been plotted on
the ordinate against $a_1$ on the abscissa, and the clock hours are indicated
on the half circle. When considered independently, the two regression
coefficients are consistent at odds of 19 in 20 with any value of the parameter
falling between the parallel horizontal or vertical lines bounding the point
$(a_1, b_1)$. The corresponding interval for the semi-amplitude, the length of
the solid line from zero to the point $(a_1, b_1)$, is defined by two parallel
arcs with their centers at zero. The time of the maximum tree potential and its
limits are marked by projections to the time scale on the half circle.
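
The standard errors and limits worked out above for the elm-tree example can be retraced with a short script. This is only a sketch of Equations 14 to 17 as given in the text; the function names are illustrative assumptions, and the value of Student's t is copied from the text rather than computed.

```python
import math


def coefficient_variance(s2, f, k):
    """Equations 15 and 16: V(a1) = V(b1) = V(A) = 2 s^2 / (f k)."""
    return 2.0 * s2 / (f * k)


def phase_variance(s2, f, k, amplitude):
    """Equation 17: V(theta) = 2 s^2 / (f k A^2), in radians squared."""
    return 2.0 * s2 / (f * k * amplitude ** 2)


f, k = 8, 24                      # replicates and intervals per cycle
a1, b1 = 2.2017, 5.0279           # decoded regression coefficients

# Equation 14 for a0, with the mean square for scatter about the trend.
var_a0 = (947.9706 / 9) / (f * k)
se_a0 = math.sqrt(var_a0)

# Composite error variance for a1, b1 and A.
s2 = (15.6421 + 186.4238 - 5.3761) / 9
amplitude = math.hypot(a1, b1)
var_coef = coefficient_variance(s2, f, k)
se_coef = math.sqrt(var_coef)

# Phase angle and its standard error, in radians and in hours.
theta = math.atan2(b1, a1)
peak_hour = theta * k / (2 * math.pi)
se_theta = math.sqrt(phase_variance(s2, f, k, amplitude))
se_hours = se_theta * k / (2 * math.pi)

# 95% limits with Student's t for 15.50 degrees of freedom (from the text).
t_05 = 2.1255
half_width = t_05 * se_coef

print(f"V(a0) = {var_a0:.5f}, SE = {se_a0:.4f}")
print(f"s2 = {s2:.4f}, A = {amplitude:.4f}, SE(a1) = {se_coef:.5f}")
print(f"theta = {theta:.4f} rad = {peak_hour:.4f} h, SE = {se_hours:.4f} h")
print(f"95% limits: +/- {half_width:.4f}")
```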


Composite  tests 

In estimating a separate interval for $a_1$ and for $b_1$, which would include
its parameter in all but five percent of trials, we would reject their true
values, considered jointly, with a frequency of $100(1 - 0.95^2) = 9.75$
percent. A more comprehensive approach is provided in Section 64 of
"The Design of Experiments" by R. A. Fisher. If $a_1$ and $b_1$ are estimates
of the true coefficients $\alpha_1$ and $\beta_1$, the following inequality
holds if the hypothesis is not to be contradicted at the percentage level
selected for the variance ratio F:

$$(a_1 - \alpha_1)^2 + (b_1 - \beta_1)^2 \le \frac{2Fs^2}{\tfrac{1}{2}kf} \tag{18}$$

where F is the tabular value with $n_1 = 2$ and $n_2$ = the degrees of freedom
in the relevant error variance $s^2$. The denominator converts the numerator,
a sum of squares with two degrees of freedom, from units of a single
variate y to that of the regression coefficients $a_1$ and $b_1$. Any pair of
postulated regression coefficients $\alpha_1$ and $\beta_1$ would be excluded
if, in the quadratic form at the left, the differences were to exceed the
limiting sum of squares on the right of the inequality.

All acceptable values of the parameters $\alpha_1$ and $\beta_1$ then fall
within a circle centered at the point $(a_1, b_1)$, which also defines the
joint limits of the true amplitude and of the true phase angle. Its radius is
the square root of the right side of the above inequality or
$\sqrt{2Fs^2/\tfrac{1}{2}kf}$. For the limits of the phase angle, the radius of
the circle for any given probability is multiplied by $k/\pi A$ to convert it
to the scale of k subdivisions in a complete cycle. The circle enclosing all
acceptable parameter values at a selected level of significance may be drawn
in a diagram not unlike Figure 11, and supplemented, if desired, with a series
of concentric circles, one for each additional probability.

For our example on tree potentials, we may obtain indirectly from the
table of z in Fisher and Yates (1957) F = 3.6572 for the 5% point and
F = 11.1471 for the 0.1% point at $n_1 = 2$ and $n_2 = 15.50$. Substituting
F = 3.6572 in Equation 18, any pair of postulated coefficients $\alpha_1$ and
$\beta_1$ which does not violate the inequality

$$(2.2017 - \alpha_1)^2 + (5.0279 - \beta_1)^2 \le 1.6651$$

would be admitted at the 5% level by our observations. This pair of values
would fall within a circle with a radius of $\sqrt{1.6651} = 1.2904$.
Substituting F for the 0.1% level, we would have a larger concentric circle
with a radius of 2.2528. These two circles have been added to Figure 11.
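
The two radii can be recomputed directly from Equation 18. The sketch below takes the tabular F values from the text instead of looking them up, and the helper names `joint_radius` and `is_admissible` are illustrative assumptions.

```python
import math


def joint_radius(F, s2, k, f):
    """Radius of the joint confidence circle, sqrt(2 F s^2 / (k f / 2))."""
    return math.sqrt(2.0 * F * s2 / (0.5 * k * f))


def is_admissible(alpha1, beta1, a1, b1, radius):
    """True if the postulated pair lies inside the circle around (a1, b1)."""
    return math.hypot(a1 - alpha1, b1 - beta1) <= radius


s2, k, f = 21.8544, 24, 8
a1, b1 = 2.2017, 5.0279

radius_5pct = joint_radius(3.6572, s2, k, f)     # F for the 5% point
radius_01pct = joint_radius(11.1471, s2, k, f)   # F for the 0.1% point

# Joint rejection rate when two separate 95% intervals are used.
joint_rejection = 100 * (1 - 0.95 ** 2)

print(f"radius at 5%: {radius_5pct:.4f}, at 0.1%: {radius_01pct:.4f}")
print(f"joint rejection with separate intervals: {joint_rejection:.2f}%")
print("origin admissible at 0.1%?", is_admissible(0.0, 0.0, a1, b1, radius_01pct))
```

Because the origin lies outside even the larger circle, a postulated curve of zero amplitude would be rejected at the 0.1% level.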


Finer  Adjustments 

Correction  for  length  of  month 

In the annual cycles that we have been considering, the variate for each
month has been given equal weight, although months differ in length by as
much as 10%. The month containing the maximum or minimum variate
has been estimated with an "average" month of 1461/48 = 30.4375 days,
and the date within the selected month then based upon its length. In a
paper of the Meteorological Research Committee (London), Craddock
(1955) has provided an adjusted set of multipliers which allows for
differences in the length of the month. With these multipliers, the coefficients
for a two-term harmonic equation can be computed as readily as with the
orthogonal cosines and sines in Table 1. For computing the corrected
expectations $Y_c$, he provides a second table of the cosines and sines for
each month. Since it is not orthogonal, his equation cannot be reduced
immediately from two terms to one term or extended to a third or higher term
as the data require. For describing the annual course of the mean
temperature in the northern hemisphere, this limitation is negligible, since
Craddock has found that a two-term Fourier series applies quite generally.

As an indication of the size of the correction with relatively precise
data, the monthly mean temperatures in Table 2 have been fitted by both
methods. When computed from the totals $T_t$ by Equation 9, weighting each
month equally and with exact values for $\Sigma u_i^2$ and $\Sigma v_i^2$
instead of their expectations, $\tfrac{1}{2}k = 6$, we have the two-term
harmonic series:

$$Y = 50.7655 + 20.9898u_1 + 3.4853v_1 - 0.1060u_2 + 0.6825v_2$$

starting with July as $t_0$. The monthly means for this 14-year period y and
their expectations Y from the above equation are listed in Table 13. When
corrected for differences in the length of the successive months but
starting in January as $t_0$, we have with Craddock's weighted multipliers the
two-term harmonic equation:

$$Y_c = 50.8623 - 19.7304u_1 - 8.6060v_1 - 0.4865u_2 + 0.5148v_2$$

The corrected predictions for each month $Y_c$ were then computed with
Craddock's parallel table of cosines and sines and the constant
$a_0 = 50.8623$.

Table 13. Comparisons of the observed monthly mean temperatures in New Haven
for 14 years (y) and their predicted values from two-term Fourier equations
computed without weighting (Y), with corrections for the length of each month
($Y_c$), and with the weights w ($Y_w$) in Table 14.

            Observed   Unweighted     Difference between means
Month           y           Y          y-Y       Yc-Y      Yw-Y

Jul         72.4357     71.6494      0.7863    -0.0071    0.1765
Aug         70.9286     71.2234     -0.2948     0.0017    0.0077
Sep         64.1357     64.9228     -0.7871     0.0185   -0.1965
Oct         54.8714     54.3568      0.5146     0.0067   -0.2137
Nov         43.4714     42.7508      0.7206    -0.0265   -0.0064
Dec         32.6929     33.6869     -0.9940    -0.0387    0.2358
Jan         29.9214     29.6697      0.2517    -0.0059    0.2806
Feb         31.4571     31.3837      0.0734     0.0370    0.0828
Mar         37.9214     37.8963      0.0251    -0.0752   -0.1702
Apr         47.8000     47.3861      0.4139    -0.0325   -0.2435
May         56.8714     57.7040     -0.8326    -0.0081   -0.0841
Jun         66.6786     66.5560      0.1226    -0.0054    0.1308

$\Sigma(y-Y)^2 = 4.04575$, $\Sigma(Y_c-Y)^2 = 0.01085$, $\Sigma(Y_w-Y)^2 = 0.36917$

The discrepancies $(Y_c - Y)$ may be compared with the deviations
$(y - Y)$ of the observed means from their simpler predictions Y. They
are clearly of a different order of magnitude. Comparing their sums of
squares, $100\,\Sigma(Y_c - Y)^2/\Sigma(y - Y)^2 = 0.27$ percent. If this
single example can be considered a reliable indicator, the discrepancy due to
computing the Fourier regression coefficients as if months were equal in length
should be negligible for most purposes.
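
The unweighted two-term fit can be retraced from the observed monthly means in Table 13. The sketch below uses exact cosines and sines with the divisor k/2 rather than the rounded multipliers of Table 1, so the coefficients agree with those in the text only to about three decimal places; the function names are illustrative assumptions.

```python
import math

# Observed 14-year monthly means from Table 13, starting with July as t = 0.
observed = [72.4357, 70.9286, 64.1357, 54.8714, 43.4714, 32.6929,
            29.9214, 31.4571, 37.9214, 47.8000, 56.8714, 66.6786]


def fit_fourier(y, n_terms=2):
    """Least-squares Fourier coefficients for k equally spaced values.

    Valid as written while n_terms < k / 2, where every cosine and sine
    column has a sum of squares of exactly k / 2.
    """
    k = len(y)
    a0 = sum(y) / k
    coefs = []
    for j in range(1, n_terms + 1):
        u = [math.cos(2 * math.pi * j * t / k) for t in range(k)]
        v = [math.sin(2 * math.pi * j * t / k) for t in range(k)]
        a_j = sum(y_t * u_t for y_t, u_t in zip(y, u)) / (k / 2)
        b_j = sum(y_t * v_t for y_t, v_t in zip(y, v)) / (k / 2)
        coefs.append((a_j, b_j))
    return a0, coefs


def predict(a0, coefs, t, k):
    """Expected value Y at interval t from the fitted series."""
    total = a0
    for j, (a_j, b_j) in enumerate(coefs, start=1):
        angle = 2 * math.pi * j * t / k
        total += a_j * math.cos(angle) + b_j * math.sin(angle)
    return total


a0, coefs = fit_fourier(observed)
fitted = [predict(a0, coefs, t, len(observed)) for t in range(len(observed))]
residual_ss = sum((y - Y) ** 2 for y, Y in zip(observed, fitted))

print(f"a0 = {a0:.4f}")
for j, (a_j, b_j) in enumerate(coefs, start=1):
    print(f"a{j} = {a_j:.4f}, b{j} = {b_j:.4f}")
print(f"sum of (y - Y)^2 = {residual_ss:.4f}")
```

The residual sum of squares comes out close to the 4.04575 reported under Table 13.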



Variance  homogeneity 

A second discrepancy between theory and observation may be traced
to our assumption of equal variability at successive intervals through the
cycle. Climatologists, for example, have long known that the variation in
temperature from year to year in a given locality is greater in winter than
in summer. To the extent that this inequality represents harmonic
variation, either in amplitude or in phase, it should be attributable to
differences between the curves fitted separately to the data for each year.
If this explanation were fully effective, the deviations of the observed
monthly means from the fitted annual curves should be of the same magnitude in
each month through the year. The problem is important in predicting the
size of discrepancies from the fitted curve, and in determining the best
estimate of the mean curve over the several replicates.

When comparing the observed temperature in each interval with its
expectation, approximations in curve fitting which are entirely adequate
in an overall analysis may prove troublesome. Sums of the squared
individual deviations may differ from their counterparts in the analysis of
variance in the third and even in the second significant figure due to
apparently negligible rounding errors, especially if the average Fourier
curve absorbs a very large proportion of the total sum of squares. As in
the calculation of a reciprocal matrix, a good numerical check may
depend upon carrying what seems initially to be an unreasonable number of
decimal places. An example is our substitution of the true value
$\tfrac{1}{2}k$ for $\Sigma u_i^2$ and $\Sigma v_i^2$ in the denominator of the
Fourier coefficients, some of which are irrational numbers rounded to three
decimal places. In a cycle of twelve subdivisions, this substitutes
$\tfrac{1}{2}k = 6$ for $\Sigma u_1^2 = \Sigma v_1^2 = 5.999824$,
$\Sigma u_2^2 = 6$ exactly and $\Sigma v_2^2 = 5.999648$, the sums of squares
of the rounded coefficients. These latter values have been used in the
following analysis.
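
The sums of squares of the rounded multipliers are easy to verify. The sketch below rounds each cosine and sine for a 12-point cycle to three decimal places, as described above, and sums the squares; the function name is an illustrative assumption.

```python
import math


def rounded_sum_of_squares(func, harmonic, k=12, places=3):
    """Sum of squares of cos or sin multipliers after rounding each one."""
    return sum(round(func(2 * math.pi * harmonic * t / k), places) ** 2
               for t in range(k))


sum_u1 = rounded_sum_of_squares(math.cos, 1)
sum_v1 = rounded_sum_of_squares(math.sin, 1)
sum_u2 = rounded_sum_of_squares(math.cos, 2)
sum_v2 = rounded_sum_of_squares(math.sin, 2)

print(f"{sum_u1:.6f} {sum_v1:.6f} {sum_u2:.6f} {sum_v2:.6f}")
```

The shortfall from 6 comes entirely from the multiplier 0.866, whose square is 0.749956 instead of 0.75.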

Because the second term in the Fourier series has varied significantly
from year to year, it has been retained in a closer analysis of the monthly
mean temperatures in New Haven in Table 2. As a first step, a separate
two-term Fourier equation (Equation 9) has been computed from the
12 monthly means (y) for each year. Each of these 14 equations was then
solved 12 times, with the $u_i$ and $v_i$ for t = 0 to 11, leading to a table
of predicted means, designated here as $\hat{y}$, which parallel the y's in
Table 2. The averages of the 14 $\hat{y}$'s, one for each month, agreed exactly
with the Y's in Table 13 computed independently with the two-term equation
based upon the monthly totals of the y's, $T_t$. Each $\hat{y}$ was then
subtracted from its corresponding observed mean temperature y in Table 2, to
obtain the deviations $d = (y - \hat{y})$ in Appendix Table 7. These total zero
for each year, and for each month their average is equal to the difference
$(y - Y)$ in Table 13.
Table 14. Observed monthly variances (per degree of freedom) of New Haven
mean temperatures for $(y - Y)^2$ from the observed means y and their
unweighted predictions Y in Table 13, $V(y)$ from the deviations of the y's in
Table 2 from their column means $\bar{y}$, $V(\hat{y})$ from the deviations of
the predicted $\hat{y}$'s about their monthly means, and $V(d)$ from the
deviations $(y - \hat{y})$ in Appendix Table 7; expected standard deviations
(SD) from the sine curve fitted to log-$V(y)$; weights
w = antilog(1 − log-$V(d)$) for computing the weighted two-term Fourier curve
$Y_w$ in Table 13.

            ------ Observed variance from ------     SD from
Month       (y-Y)^2     V(y)    V(y-hat)    V(d)     log-V(y)      w

Jul          14.840     3.538     7.348     2.039      1.683      5.4
Aug           2.087     3.401     7.475     1.309      1.680      4.8
Sep          14.866     2.904     4.162     3.206      1.853      3.5
Oct           6.357     4.564     4.018     5.123      2.197      2.4
Nov          12.465     4.851    10.678     4.805      2.679      1.6
Dec          23.715    12.170    20.887     8.533      3.183      1.2
Jan           1.521    19.450    21.260    12.752      3.519      1.1
Feb           0.129    11.101    13.342     6.257      3.525      1.3
Mar           0.015     9.560    11.171     5.670      3.198      1.7
Apr           4.112     5.832    11.388     2.432      2.696      2.6
May          16.636     4.798     6.577     4.291      2.212      3.8
Jun           0.361     3.440     4.537     2.275      1.861      4.9

Mean          8.092     7.134    10.275     4.891      2.434     34.3 = total
Sum of DF         7       156        65        91

Four  variances  were  then  determined  for  each  month  in  units  of  the 
variance  of  a  single  monthly  temperature  y.  The  average  of  each  series  of 
variances  over  the  12  months  agreed  with  its  corresponding  mean  square 
from  the  analysis  of  variance,  in  several  cases  combining  sums  of  squares 
that  were  reported  initially  in  separate  rows.  The  series  of  variances  in 
Table  14  have  the  following  composition: 

Those from $(y - Y)^2$ measure the discrepancy of the observed 14-year
average for each month from that computed with the two-term Fourier
equation for all 14 years, each with 7/12 of a degree of freedom. These
deviations would be absorbed completely by the remaining terms of the
Fourier series if it were extended to the limit.

The empirical variances $V(y) = \Sigma(y - \bar{y})^2/13$ represent the
variation of the 14 y's for each month in Table 2 about their observed or
column mean $\bar{y}$. They show a marked seasonal trend. Each sum of squares
$\Sigma(y - \bar{y})^2$ with 13 degrees of freedom has been divided into two
parts to obtain the next two series of variances.

The variances $V(\hat{y})$ measure the variation of the predicted $\hat{y}$'s
about their mean, or that part of the variation in each month which is
attributable to the 14 annual two-term Fourier curves. The average of the
$V(\hat{y})$'s with 65 degrees of freedom is equal to the mean of the sums of
squares in rows 1+5+6 of the analysis of variance (Table 6). These monthly
variances, each with 65/12 = 5.4167 degrees of freedom, absorb part, at least,
of the seasonal trend in the variance.

The variances $V(d)$, averaging less than half of the $V(\hat{y})$'s,
represent our nearest approach to a random error. They have been computed from
the differences d for each month in Appendix Table 7 as
$V(d) = 12\,\Sigma(d - \bar{d})^2/91$, each with 91/12 degrees of freedom.
Their mean corresponds in the analysis of variance to the mean square in row 7.
Although much of the initial seasonal trend in the variance has been absorbed
by the $V(\hat{y})$'s, a substantial amount still persists.

The pattern of the seasonal trend in the empirical variance $V(y)$ and
in its two components in Table 14 may be defined periodically. Since the
distribution of the log-variance is approximately normal (Bartlett, 1947),
the following sine curves have been fitted to their logarithms and plotted
in Fig. 12:

$$\log V(y) = 0.7727 - 0.3203u_1 - 0.0888v_1 \qquad (s^2 = 0.01198)$$

$$\log V(\hat{y}) = 0.9490 - 0.2585u_1 - 0.0662v_1 \qquad (s^2 = 0.02532)$$

$$\log V(d) = 0.6071 - 0.3384u_1 + 0.0165v_1 \qquad (s^2 = 0.02106)$$

In no case was the second Fourier term significant. By Equation 4, the
semi-amplitudes (A) of these curves are respectively 0.3324±0.0119,
0.2668±0.0174, and 0.3392±0.0158. From antilog (2A) for each series,
the smallest expected variance in the mean summer temperature would be
multiplied by a factor of 4.62 for y, 3.42 for $\hat{y}$, and 4.77 for d to
obtain the largest winter variance. From the phase angle for each curve, the
variances were maximal on January 31, 30 and 12 respectively.
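
The semi-amplitudes and the winter-to-summer variance ratios follow directly from the three fitted equations. The sketch below recomputes them from the printed coefficients; the dictionary keys and variable names are illustrative assumptions.

```python
import math

# (a0, a1, b1) for the sine curves fitted to the log-variances.
log_variance_curves = {
    "V(y)": (0.7727, -0.3203, -0.0888),
    "V(y-hat)": (0.9490, -0.2585, -0.0662),
    "V(d)": (0.6071, -0.3384, 0.0165),
}

summary = {}
for name, (a0, a1, b1) in log_variance_curves.items():
    semi_amplitude = math.hypot(a1, b1)
    # The curve swings 2A in log10 units between its minimum and maximum.
    ratio = 10 ** (2 * semi_amplitude)
    summary[name] = (semi_amplitude, ratio)
    print(f"{name}: A = {semi_amplitude:.4f}, largest/smallest = {ratio:.2f}")
```

Computed from the rounded coefficients as printed, the first two series reproduce the values in the text; the third agrees to about three decimal places in A.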


Weighted  periodic  curves 

The variance of the mean temperature differs sufficiently through the
year from the equality implied in our initial model, that a weighted
analysis might be expected to improve our estimate. Appropriate weights would
be the reciprocals of the expected random variance, computed from the
sine curve for log-$V(d)$ as w = antilogarithm of (1 − log-$V(d)$). These
weights, in the last column of Table 14, vary from 1.1 to 5.4 and resemble
the second of the three weighting systems suggested by Craddock (1955)
for a similar purpose. The weighted two-term Fourier curve, computed by
partial regression, has the equation:

$$Y_w = 50.7655 + 20.9378u_1 + 3.5002v_1 + 0.1226u_2 + 0.6028v_2$$

When solved with the cosines and sines for each month, the weighted mean
temperatures $Y_w$ differ from the unweighted expected means Y as shown
by the differences $(Y_w - Y)$ in the last column of Table 13.

Figure 12. Seasonal variation in the logarithm of the variances in Table 14,
from $[y^2] = \Sigma(y - \bar{y})^2$ for the overall deviations in the monthly
mean temperatures, and from its components $[\hat{y}^2]$ for the predicted
values $\hat{y}$ and $[d^2]$ for the deviations $d = (y - \hat{y})$, each
fitted with a sine curve.
The sum of squares from these differences is a considerably larger
fraction (9.12%) of $\Sigma(y - Y)^2$ than the 0.27 percent for the
corresponding differences $(Y_c - Y)$. Although the weighted estimates $Y_w$
may be superior theoretically, their curve requires the solution of a
reciprocal matrix and gives considerably more weight to the summer than to the
winter temperatures. From a commonsense point of view, one may question whether
the weighted estimates are as satisfactory climatologically as those from the
unweighted Fourier equation, to which each month contributes equally. Is
it wise to base the estimate of the annual curve so largely upon the
summer months?
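
The weights w and the expected standard deviations in Table 14 can be regenerated from the fitted sine curves for log-V(d) and log-V(y). The sketch below evaluates each curve at t = 0 (July) to 11 (June); the function and variable names are illustrative assumptions, and the tabulated columns are copied from Table 14 for comparison.

```python
import math

LOG_VD = (0.6071, -0.3384, 0.0165)    # sine curve fitted to log-V(d)
LOG_VY = (0.7727, -0.3203, -0.0888)   # sine curve fitted to log-V(y)


def expected_log_variance(curve, t, k=12):
    """Value of a0 + a1*cos + b1*sin at interval t of a k-point cycle."""
    a0, a1, b1 = curve
    angle = 2 * math.pi * t / k
    return a0 + a1 * math.cos(angle) + b1 * math.sin(angle)


# w = antilog(1 - log V(d)), i.e. ten times the reciprocal expected variance.
weights = [10 ** (1 - expected_log_variance(LOG_VD, t)) for t in range(12)]

# SD = antilog of half the expected log-variance of y.
expected_sd = [10 ** (0.5 * expected_log_variance(LOG_VY, t)) for t in range(12)]

table_w = [5.4, 4.8, 3.5, 2.4, 1.6, 1.2, 1.1, 1.3, 1.7, 2.6, 3.8, 4.9]
table_sd = [1.683, 1.680, 1.853, 2.197, 2.679, 3.183,
            3.519, 3.525, 3.198, 2.696, 2.212, 1.861]

months = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]
for m, w, sd in zip(months, weights, expected_sd):
    print(f"{m}: w = {w:.1f}, SD = {sd:.3f}")
```

The summer weights are four to five times the winter weights, which is the imbalance questioned in the paragraph above.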


Normality  of  temperature  deviations 

In  analyses  of  variance  of  periodic  regressions  we  tacitly  assume  not 
only  that  the  random  deviations  are  equally  variable  at  each  t  but  also 
that  their  distribution  is  normal.  Because  of  the  small  number  of  years 
in  our  climatological  example,  the  normality  of  the  deviations  d  has  been 
tested  graphically.  The  rankits*  for  a  sample  of  14  have  been  plotted  in 
Figure  13  against  the  deviations  for  each  month  in  Appendix  Table  7  in 
rank  order  and  each  fitted  with  a  straight  line.  Their  slopes  are  less  in 
winter  than  in  summer,  as  would  be  expected  from  the  seasonal  change  in 
the  variance.  If  the  distributions  are  normal,  the  plotted  points  should  not 
curve  systematically  from  the  computed  straight  lines.  To  test  whether  the 
trends  in  Figure  13  cancel  out,  the  deviations  may  be  averaged  for  each 
position over the 12 months (i.e., the largest in each month, the next
largest, etc.). The rankits have been plotted against these averages in the
left side of Figure 14 and fitted with a line passing through 0,0 with a
slope of 1/s = 0.5920, where
$s = \sqrt{445.070/(12 \times 13)} = 1.6891$. The close agreement with a
straight line confirms our initial hypothesis that the variation about the
two-term Fourier series for each year is here essentially normal.
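
The ingredients of this graphic test can be sketched as follows. The exact rankits come from Fisher and Yates (1957, Table XX); the code instead uses Blom's approximation to the expected normal order statistics, which is an assumption introduced here and not the method of the text, so its scores differ slightly from the tabulated rankits. The slope of the reference line is computed as in the text.

```python
import math
from statistics import NormalDist


def approximate_rankits(n):
    """Blom's approximation to expected normal order statistics (not exact)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


scores = approximate_rankits(14)          # one score per rank in a sample of 14

# Pooled standard deviation of the deviations d, from the text.
s = math.sqrt(445.070 / (12 * 13))
slope = 1 / s                             # slope of the line through 0,0

print([round(z, 3) for z in scores])
print(f"s = {s:.4f}, slope = {slope:.4f}")
```

Plotting these scores against the ordered deviations for a month, or against their averages over the 12 months, should give a nearly straight line if the deviations are normal.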

Two aspects of periodic regression need to be distinguished. The first is
the harmonic analysis of periodic data to determine their underlying
pattern and the magnitude and nature of the variation to which this pattern
has been exposed. The second problem is that of predicting future
responses from our present data, as is commonly the objective in
climatology. Unless the constants in our fitted Fourier curve for each year
were to define a trend which might be expected to continue, and
climatologists are not agreed upon the existence of these trends, the
prediction of future temperatures would have to be based upon the average
curve for past years.


*  A  rankit  is  the  expected  mean  deviation  for  each  rank  in  an  ordered  sample  of  a  given  size  from 
a  normal  population  with  a  mean  of  zero  and  a  standard  deviation  of  one  (Fisher  and  Yates, 
1957,  Table  XX). 
The error in our prediction would then involve not only the variation
around the annual curves, which seems to be satisfactorily normal although
not constant, but also the variation of the annual curves about their
average for the series of years. When these two sources of variation are
combined, a convenient estimate of the standard deviation for each month in
°F is SD = antilogarithm of ½(log-V(y)) in Table 14, from the equation
of the upper sine curve in Figure 12. There is no assurance a priori,
however, that the composite variation will be distributed normally.

Figure 13. Rankit test for agreement of the deviations in the monthly mean
temperature with the normal distribution, from the differences d = (y − Y)
in Appendix Table 7. (Horizontal axis: deviations in degrees Fahrenheit.)
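
As a minimal sketch of this conversion, assuming that the standard
deviation is obtained as the antilogarithm of half the log-variance, the
step can be written as follows. The variance used is a hypothetical value
chosen only for illustration and is not taken from Table 14; common
(base 10) logarithms are also an assumption, although the result does not
depend on the base.

```python
import math


def sd_from_log_variance(log10_variance):
    """Standard deviation as the antilogarithm of half the log-variance."""
    return 10 ** (0.5 * log10_variance)


# Hypothetical monthly variance in (degrees F)^2, for illustration only.
variance = 2.853
log_v = math.log10(variance)

sd = sd_from_log_variance(log_v)

# Halving the logarithm is the same as taking the square root.
print(sd, math.sqrt(variance))
```

Working on the logarithmic scale is convenient here because the text fits
the sine curve to the log-variance, so a predicted log-variance for any
month converts directly to a standard deviation.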

For  a  graphic  test  of  normality,  the  deviations  (y— Y),  which  also 
include  the  differences  (y— Y),  have  been  computed  from  the  y's  in  Table 
2  and  the  Y's  in  Table  13.  These  were  ranked  in  order  for  each  month 
and  then  averaged  over  the  twelve  months  to  obtain  the  rankit  diagram  in 
the  right  side  of  Figure  14.  The  plotted  points  have  been  fitted  with  a 
straight  line  passing  through  0,0  with  a  slope  of  1/s  =  1/2.679.  Not  only 
is  the  slope  much  less  than  that  for  the  deviations  about  the  annual  fitted 
curves  in  the  left  side  of  the  figure,  but  the  points  themselves  describe  a 
trend  that  is  less  certainly  linear. 


[Figure 14: two rankit diagrams, rankits plotted against mean deviations;
left panel for the deviations about the annual fitted curves, right panel
for the deviations from the monthly means.]

Figure 14. Test for normality of the ranked deviations (y − Y) in Figure 13
averaged over the 12 months (left curve), compared with a similar diagram of
the average deviations (y − ȳ) from the 14-year means for each month (ȳ).

Despite their limited sensitivity with as few as 14 replicates, the
numerical measures of skewness ($g_1$) and kurtosis ($g_2$) have the
advantage of separating these two types of non-normality (Fisher, 1954).
Both statistics are normally distributed about zero with a standard error
depending only upon the size of the sample. They have been computed for each
month from the distribution of the observed temperatures y about their
monthly means ȳ; neither approached significance in any one month or in
composite $\chi^2$ tests over all 12 months. On the off chance that a
seasonal trend might still be discernible, separate sine curves have been
fitted to the twelve monthly values for $g_1$ and for $g_2$. Neither
periodic trend approached significance (P > 0.20), but their minima and
amplitudes may be suggestive. The curve for $g_1$ had a minimum in December
and an amplitude of 0.507 ± 0.338, and that for $g_2$ a minimum in January
and an amplitude of 1.019 ± 0.552. In developing probability statements for
the monthly mean temperature from more extensive data, we may need to
consider not only seasonal changes in the standard deviation about the
average two-term Fourier curve, but also seasonal departures from normality.
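
The two measures can be sketched in code, assuming that the standard
k-statistic definitions of $g_1$ and $g_2$ and their usual standard errors
are the ones intended. The function names and the sample of 14 values are
hypothetical and serve only as an illustration; they are not data from
this bulletin.

```python
import math


def fisher_g(sample):
    """Return (g1, g2): skewness and kurtosis from Fisher's k-statistics."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample)
    s3 = sum((x - mean) ** 3 for x in sample)
    s4 = sum((x - mean) ** 4 for x in sample)
    k2 = s2 / (n - 1)
    k3 = n * s3 / ((n - 1) * (n - 2))
    k4 = (n * (n + 1) * s4 - 3 * (n - 1) * s2 ** 2) / (
        (n - 1) * (n - 2) * (n - 3)
    )
    return k3 / k2 ** 1.5, k4 / k2 ** 2


def standard_errors(n):
    """Standard errors of g1 and g2, which depend only on sample size n."""
    var_g1 = 6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))
    var_g2 = 24 * n * (n - 1) ** 2 / (
        (n - 3) * (n - 2) * (n + 3) * (n + 5)
    )
    return math.sqrt(var_g1), math.sqrt(var_g2)


# Hypothetical deviations for one month over 14 years (illustration only).
sample = [-2.9, -2.1, -1.6, -1.0, -0.7, -0.3, 0.0,
          0.2, 0.5, 0.9, 1.3, 1.8, 2.4, 3.1]

g1, g2 = fisher_g(sample)
se_g1, se_g2 = standard_errors(len(sample))

# Ratios to the standard errors, referred to the normal distribution.
print(g1 / se_g1, g2 / se_g2)
```

With only 14 replicates both standard errors are large, which is the
limited sensitivity noted in the text.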

Summary 

Periodic regression is applied here to cyclic phenomena in biology and
climatology in which (1) the length of the cycle, such as a year or day, is
determined independently of the response, (2) the observations are spaced
evenly through the cycle, and (3) the number of replicates is constant at
each interval. When the response (y) changes symmetrically through the
cycle, the first harmonic or sine curve is defined by the mean response
($a_0$) and two orthogonal regression coefficients, $a_1$ for the cosine
$u_1$ and $b_1$ for the sine $v_1$, as $Y = a_0 + a_1 u_1 + b_1 v_1$, from
which we can compute its amplitude and phase angle. When the curve is not
symmetrical, the sine curve can be extended with additional terms for two,
three or more cycles in each fundamental period by classical Fourier
analysis until the desired fit is achieved.
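
A minimal sketch of the first-harmonic fit, using the hourly totals over the
four hospitals from Appendix Table 5, is given here. It assumes that t = 0
falls at midnight and that each hour spans 15 degrees; the variable and
function names are hypothetical. The peak hour is taken from the angle whose
tangent is $b_1/a_1$, which is one common convention and may differ from the
phase angle as defined elsewhere in this bulletin.

```python
import math

# "Total" column of Appendix Table 5 (sum of y over hospitals A to D),
# for the hours starting at midnight, 1 AM, ..., 11 PM.
totals = [74.46, 74.58, 76.14, 78.92, 78.31, 79.72, 77.64, 75.71,
          78.90, 78.61, 76.79, 74.34, 72.57, 71.18, 70.75, 66.52,
          70.57, 68.21, 68.83, 65.91, 70.20, 70.88, 70.09, 72.17]
n_rep = 4          # replicates (hospitals) at each hour
k = len(totals)    # equally spaced intervals in the cycle

# Orthogonal cosine and sine terms for the first harmonic.
angles = [2 * math.pi * t / k for t in range(k)]
u1 = [math.cos(a) for a in angles]
v1 = [math.sin(a) for a in angles]

# With evenly spaced times, sum(u1^2) = sum(v1^2) = k / 2, so each
# coefficient is a simple sum of products divided by n_rep * k / 2.
a0 = sum(totals) / (n_rep * k)
a1 = sum(u * T for u, T in zip(u1, totals)) / (n_rep * k / 2)
b1 = sum(v * T for v, T in zip(v1, totals)) / (n_rep * k / 2)

amplitude = math.hypot(a1, b1)
peak_hour = (math.atan2(b1, a1) % (2 * math.pi)) * k / (2 * math.pi)


def fitted(t):
    """Expected response Y at hour t from the first harmonic."""
    a = 2 * math.pi * t / k
    return a0 + a1 * math.cos(a) + b1 * math.sin(a)


print(a0, a1, b1, amplitude, peak_hour)
```

The coefficients agree with the last column of the two bottom rows of
Appendix Table 5, and `fitted(t)` reproduces the Expected column of that
table to the precision printed there.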

The  analysis  of  variance  for  deciding  how  many  terms  to  retain  in  a 
Fourier  curve  and  for  determining  its  error  is  based  upon  the  mathematical 
model  for  replicated  regressions.  In  effect,  a  Fourier  curve  is  fitted  to  each 
replicate  and  the  analysis  determines  in  what  respects  these  separate 
curves  differ  from  replicate  to  replicate.  Various  aspects  of  the  calculation 
are  illustrated  by  the  monthly  mean  temperatures  in  New  Haven  over  a 
14-year  period,  the  monthly  iodine  values  of  butterfat  from  five  creameries 
in  Alberta,  and  the  electrical  potential  of  an  elm  tree  in  eight  three-day 
periods  in  August,  1953. 

Both  the  number  of  terms  in  a  periodic  regression  and  the  validity  of 
its  analysis  depend  upon  the  selection  of  a  suitable  unit  for  the  response. 
The transformation to logarithms is applied to monthly data on two
contagious diseases. The square root transformation for counts is
illustrated with data on the hour of birth, where agreement with the assumed
Poisson variation about a diurnal sine curve can be tested by $\chi^2$. The
analysis of seasonal variation in the log-ED50 for a biocide or a drug is
computed by maximum likelihood from all-or-none data with probits. A
periodic response can be corrected for differences in a concomitant
environmental factor by covariance, as illustrated by the adjustment for
aperiodic humidity of the diurnal variation in the log-heat exchange of
cows in an experimental barn.

The precision of the constants for the first harmonic in a periodic
regression is considered from two viewpoints. The first defines the variance
of the statistics of a sine curve and the confidence limits of their
parameters when each statistic is considered separately. The second defines
a joint circular region within which any combination of the parameters for
$a_1$ and $b_1$ is compatible with our observations at a given probability.

Finer  adjustments  in  periodic  regression  are  examined  with  the  monthly 
mean  temperatures  in  New  Haven.  A  correction  for  differences  in  the 
length  of  the  month  proved  of  minor  importance  relative  to  other  errors 
in  fitting  a  two-term  Fourier  curve.  Seasonal  changes  in  the  variance 
through  the  year  could  be  divided  into  two  components,  one  representing 
differences between the observed monthly temperatures and their
predictions by annual two-term Fourier curves, and the other differences
between these predicted temperatures and the average two-term Fourier
curve for all 14 years. For each source the log-variance changed
periodically through the year in a sine curve, leading to estimated standard
deviations for probability predictions and to weights for recomputing the
average
Fourier  curve. 

A  distinction  is  drawn  between  two  objectives  in  periodic  analysis,  that 
of  locating  sources  of  variation  and  describing  their  characteristics,  and 
forecasting,  which  must  ordinarily  be  based  upon  the  average  over  all 
replicates  because  of  the  unpredictable  nature  of  long  term  trends.  The 
summer  temperatures  contributed  proportionately  more  to  the  weighted 
regression than the winter temperatures, a feature which may be
potentially less desirable for climatological predictions than the simpler
process
of  equal  weighting  through  the  year.  In  graphic  tests  with  rankits,  the 
approximately  random  deviations  from  the  yearly  two-term  Fourier  curves 
proved  to  be  satisfactorily  normal  but  when  these  were  increased  by  the 
larger  differences  between  the  yearly  and  the  average  curves,  the  data 
suggest  that  seasonal  departures  from  normality  may  modify  probability 
predictions  based  upon  an  average  Fourier  curve. 

Appendix Table 2.  Hourly standing potentials in an elm tree in Lyme,
Connecticut, in August, 1953. (H. S. Burr, 1958)

              y = - -22 (Daily potential) - 150 on August
Hour      1-3    4-6   8-10  11-13  14-16  17-19  20-22  23-25   Total Tt

Mt.        23     30     41     38     41     43     50     76      342
 1         23     29     39     33     41     42     49     71      327
 2         23     24     32     32     36     41     48     68      304
 3         20     22     36     31     34     39     46     65      293
 4         20     22     36     28     33     38     44     58      279
 5         19     22     30     30     32     41     43     57      274
 6         20     22     30     28     32     42     42     57      273
 7         20     22     32     30     33     42     41     57      277
 8         17     31     36     37     37     44     39     57      298
 9         20     39     40     49     38     41     41     59      327
10         22     51     44     58     41     50     46     65      377
11         26     53     55     64     52     54     57     75      436
12         32     57     64     69     62     56     65     79      484
 1         32     60     69     70     67     58     72     79      507
 2         32     63     70     70     63     61     74     79      512
 3         35     63     70     70     62     65     74     81      520
 4         38     58     70     70     61     66     76     80      519
 5         38     58     68     70     60     68     77     80      519
 6         38     63     64     70     57     68     77     78      515
 7         40     63     60     67     57     67     73     78      505
 8         34     57     57     56     53     62     68     78      465
 9         35     55     54     51     50     56     64     79      444
10         23     54     51     49     43     49     58     80      407
11         20     54     47     48     41     45     56     78      389

Tr        650   1072   1195   1218   1126   1238   1380   1714     9593

Appendix Table 5.  Number of normal human births in each hour in four
hospital series, transformed to y = √(No. of births). (King, 1956)

Hour                    y in Hospital                        Observed   Expected
starting         A          B          C          D    Total  √births = ȳ      Y

Mt 12        13.56      19.24      20.52      21.14    74.46    18.6150   18.463
AM  1        14.39      18.68      20.37      21.14    74.58    18.6450   18.812
    2        14.63      18.89      20.83      21.79    76.14    19.0350   19.129
    3        14.97      20.27      21.14      22.54    78.92    19.7300   19.393
    4        15.13      20.54      20.98      21.66    78.31    19.5775   19.587
    5        14.25      21.38      21.77      22.32    79.72    19.9300   19.697
    6        14.14      20.37      20.66      22.47    77.64    19.4100   19.716
    7        13.71      19.95      21.17      20.88    75.71    18.9275   19.641
    8        14.93      20.62      21.21      22.14    78.90    19.7250   19.479
    9        14.21      20.86      21.68      21.86    78.61    19.6525   19.240
   10        13.89      20.15      20.37      22.38    76.79    19.1975   18.941
   11        13.60      19.54      20.49      20.71    74.34    18.5850   18.602
M  12        12.81      19.52      19.70      20.54    72.57    18.1425   18.246
PM  1        13.27      18.89      18.36      20.66    71.18    17.7950   17.897
    2        13.15      18.41      18.87      20.32    70.75    17.6875   17.579
    3        12.29      17.55      17.32      19.36    66.52    16.6300   17.315
    4        12.92      18.84      18.79      20.02    70.57    17.6425   17.121
    5        13.64      17.18      18.55      18.84    68.21    17.0525   17.011
    6        13.04      17.20      18.19      20.40    68.83    17.2075   16.993
    7        13.00      17.09      17.38      18.44    65.91    16.4775   17.067
    8        12.77      18.19      18.41      20.83    70.20    17.5500   17.229
    9        12.37      18.41      19.10      21.00    70.88    17.7200   17.468
   10        13.45      17.58      19.49      19.57    70.09    17.5225   17.767
   11        13.53      18.19      19.10      21.35    72.17    18.0425   18.106

Total       327.65     457.54     474.45     502.36  1762.00    18.3542
Σ(u1 y)    3.25792   -3.42395    2.77825    2.59608  5.20830     .10851
Σ(v1 y)   10.62339   19.04199   20.38840   15.29826 65.35204    1.36150

Appendix Table 6.  Hourly humidity or mixing ratio, x = 2(H2O/dry air) − 0.8,
and average heat exchange per cow, y = log (BTU/10^), in an experimental dairy
barn on 3 days in 1949. (Thompson, 1954)

Hour          Mixing ratio, x on               Log-BTU, y on             Adj.
starts    10/11  10/30  11/20     Tt     10/11   10/30   11/20      Tt      y

 3 pm       1.3     .5     .6    2.4      .512    .407    .423   1.342  .4310
 4          1.0     .5     .5    2.0      .484    .415    .447   1.346  .4471
 5          1.1     .7     .4    2.2      .550    .512    .462   1.524  .4991
 6           .9     .8     .4    2.1      .512    .512    .477   1.501  .4951
 7           .8     .7     .6    2.0      .505    .512    .512   1.529  .5044
 8          1.0     .4     .6    2.0      .613    .462    .532   1.607  .5341
 9           .9     .6     .7    2.2      .607    .498    .525   1.630  .5344
10          1.1     .6     .7    2.4      .623    .505    .505   1.633  .5280
11           .8     .5     .7    2.0      .525    .498    .519   1.542  .5124
12           .9     .4     .5    1.8      .538    .470    .491   1.499  .5055
 1 am       1.0     .3     .3    1.6      .550    .470    .484   1.504  .5145
 2           .4     .4     .6    1.4      .519    .477    .477   1.473  .5116
 3           .4     .2     .3     .9      .532    .431    .498   1.461  .5260
 4           .6     .7     .7    2.0      .371    .407    .447   1.225  .4068
 5          1.0     .9     .9    2.8      .447    .491    .491   1.429  .4452
 6           .9     .5     .7    2.1      .439    .431    .462   1.332  .4387
 7          1.2     .7     .5    2.4      .505    .477    .470   1.452  .4677
 8           .7     .8     .8    2.3      .415    .477    .491   1.383  .4484
 9           .8     .1     .5    1.4      .423    .407    .407   1.237  .4329
10           .5     .7     .4    1.6      .389    .455    .447   1.291  .4435
11           .8     .5     .3    1.6      .398    .439    .423   1.260  .4332
12           .8     .4     .5    1.7      .423    .447    .447   1.317  .4485
 1 pm       1.0     .9     .7    2.6      .491    .498    .498   1.487  .4720
 2           .6     .4     .4    1.4      .470    .462    .477   1.409  .4902

Tr         20.5   13.2   13.3   47.0    11.841  11.160  11.412  34.413  .4779